
Cube-to-RAG

Natural language interface for Cube.js analytics using RAG (Retrieval-Augmented Generation) with vector search and LLM agents.

Features

  • Semantic Schema Search: Discover cubes, dimensions, and measures using natural language
  • GraphQL Query Generation: Automatically build and execute Cube.js GraphQL queries
  • Vector Database: Store and search schema embeddings in Milvus
  • Multiple LLM Providers: Support for OpenAI, Anthropic Claude, and AWS Bedrock
  • Streaming Responses: Real-time token-by-token streaming for better UX
  • FastAPI Backend: Modern async Python web framework
  • Jinja2 Templates: Flexible schema formatting

Architecture

User Question
     ↓
1. Semantic Search (Milvus)
   └→ Find relevant cubes, dimensions, measures
     ↓
2. GraphQL Query Construction
   └→ Build query using discovered schema
     ↓
3. Execute Query (Cube.js GraphQL API)
   └→ http://cube_api:4000/cubejs-api/graphql
     ↓
4. Format & Explain Results
   └→ Natural language response
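The four steps above can be sketched end-to-end in Python with stubbed components. The function names here are illustrative only, not the package's API; steps 1 and 3 are stubbed where the real system would call Milvus and the Cube.js GraphQL endpoint.

```python
# Sketch of the four-step flow, with Milvus and Cube.js calls stubbed out.

def semantic_search(question: str) -> dict:
    # Step 1: would query Milvus for relevant schema; stubbed with a fixed hit
    return {"cube": "coursePerformanceSummary",
            "measures": ["avgGpa"], "dimensions": ["department"]}

def build_query(schema: dict) -> str:
    # Step 2: assemble a GraphQL query from the discovered schema
    fields = " ".join(schema["measures"] + schema["dimensions"])
    return f'query {{ cube {{ {schema["cube"]} {{ {fields} }} }} }}'

def execute_query(gql: str) -> list[dict]:
    # Step 3: would POST to http://cube_api:4000/cubejs-api/graphql; stubbed
    return [{"department": "Math", "avgGpa": 3.4}]

def format_answer(rows: list[dict]) -> str:
    # Step 4: would be phrased by the LLM; stubbed as a plain summary
    return f"Found {len(rows)} row(s): {rows}"

answer = format_answer(
    execute_query(build_query(semantic_search("average GPA by department?")))
)
print(answer)
```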

Installation

Using pip

pip install cube-to-rag

Using Poetry

poetry add cube-to-rag

From source

git clone https://github.com/yourusername/cube-to-rag.git
cd cube-to-rag
poetry install

Configuration

Create a .env file with your configuration:

# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'  # or 'openai:gpt-4' or 'bedrock:...'
EMBEDDING_MODEL='openai:text-embedding-3-small'  # or 'bedrock:amazon.titan-embed-text-v2:0'

# API Keys (choose based on your LLM/embedding provider)
ANTHROPIC_API_KEY='sk-ant-your-key-here'
OPENAI_API_KEY='sk-your-key-here'

# AWS Bedrock (if using Bedrock models)
AWS_ACCESS_KEY_ID='your-aws-key'
AWS_SECRET_ACCESS_KEY='your-aws-secret'
AWS_DEFAULT_REGION='us-east-1'

# Cube.js Configuration
CUBE_URL='http://cube_api:4000'  # Base URL (GraphQL and REST APIs are auto-constructed)
# Optional: Cube.js API secret (JWT tokens are auto-generated)
# CUBEJS_API_SECRET='your-cubejs-api-secret'

# Milvus Vector Database
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'

# Security
SECRET_KEY='your-secret-key-here'
FAST_API_ACCESS_SECRET_TOKEN='your-access-token'
DEPLOY_ENV='local'  # or 'prod'
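For reference, the JWT auto-generated from CUBEJS_API_SECRET is a standard HS256 token. The package handles this internally; the sketch below only illustrates the mechanics with the standard library, and make_cube_jwt is a hypothetical helper.

```python
import base64
import hashlib
import hmac
import json
import time


def make_cube_jwt(secret: str, ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT like the one sent to Cube.js.

    Hypothetical helper for illustration -- the package generates
    its token automatically from CUBEJS_API_SECRET.
    """
    def b64url(data: bytes) -> str:
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"exp": int(time.time()) + ttl_seconds}).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(
        hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    )
    return f"{header}.{payload}.{signature}"


token = make_cube_jwt("your-cubejs-api-secret")
# Sent to Cube.js in the Authorization header
```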

Quick Start

1. Start Required Services

You'll need:

  • Cube.js running with GraphQL API enabled
  • Milvus vector database

Using Docker Compose:

docker-compose -f docker-compose.yml -f docker-compose.milvus.yml -f docker-compose.ai.yml up -d

2. Run the API Server

# Using uvicorn directly
uvicorn app.server.main:app --host 0.0.0.0 --port 8080

# Or using the installed package
python -m cube_to_rag

3. Ingest Cube.js Schemas

The API will automatically fetch and ingest schemas from Cube.js on startup. You can also trigger manual ingestion:

curl -X POST http://localhost:8080/embeddings/ingest \
  -H "Content-Type: application/json"

Using in Your Own Applications

The cube-to-rag package provides tools and utilities you can add to your existing LangChain agents and applications.

Installation

pip install cube-to-rag
# or
poetry add cube-to-rag

1. Add Cube.js Extraction Tool to Your Agent

Add Cube.js querying capabilities to any existing LangChain agent:

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.agent_toolkits.load_tools import load_tools

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_cube_schema_search_tool

# Your existing tools
your_existing_tools = [...]

# Add Cube.js extraction tools
cube_schema_search = get_cube_schema_search_tool(k=3)

# Use LangChain's built-in GraphQL tool for queries
graphql_tools = load_tools(
    ["graphql"],
    graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
    llm=get_llm()
)

# Combine all tools
tools = your_existing_tools + [cube_schema_search] + graphql_tools

# Create agent with all tools
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to analytics data."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Now your agent can answer questions like:
result = agent_executor.invoke({
    "input": "What's the average course GPA by department?"
})

2. Add Documents to Milvus Vector Database

Easily embed and store your documents in Milvus:

from langchain_core.documents import Document
from cube_to_rag.tools import create_milvus_helper

# Create helper for your collection
helper = create_milvus_helper(collection_name="my_documents")

# Add documents
docs = [
    Document(
        page_content="Your document content here",
        metadata={"source": "doc1.pdf", "page": 1}
    ),
    Document(
        page_content="More content",
        metadata={"source": "doc2.pdf", "page": 1}
    )
]

ids = helper.add_documents(docs)
print(f"Added {len(ids)} documents to Milvus")

# Or add text directly
texts = ["First document", "Second document"]
metadatas = [{"source": "text1"}, {"source": "text2"}]

ids = helper.add_texts(texts, metadatas)

3. Add Vector Search Tool to Your Agent

Add semantic document search to any LangChain agent:

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_vector_search_tool

# Your existing tools
your_existing_tools = [...]

# Add vector search tool
vector_search = get_vector_search_tool(
    collection_name="my_documents",
    k=5
)

# Combine tools
tools = your_existing_tools + [vector_search]

# Create agent
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to document search."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

# Now your agent can search documents
result = agent_executor.invoke({
    "input": "Find documents about machine learning"
})

4. Complete Example: Agent with All Tools

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.agent_toolkits.load_tools import load_tools

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import (
    get_cube_schema_search_tool,
    get_cube_graphql_tools,
    get_vector_search_tool,
    create_milvus_helper
)

# Initialize tools
cube_schema = get_cube_schema_search_tool(k=2)
vector_search = get_vector_search_tool(collection_name="docs", k=3)

# JWT tokens are auto-generated from settings.cube_api_secret
# No need to pass api_token manually - it's handled automatically
graphql_tools = get_cube_graphql_tools(
    graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
    llm=get_llm()
)

tools = [cube_schema, vector_search] + graphql_tools

# Create agent
llm = get_llm()

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a powerful analytics assistant with access to:

    1. Cube.js analytics (use cube_schema_search FIRST, then graphql)
    2. Document search (use vector_search)

    Always search for relevant schemas/documents before answering."""),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True
)

# Use the agent
result = agent_executor.invoke({
    "input": "What metrics are available for student performance?"
})
print(result["output"])

5. Working with Milvus Directly

Advanced Milvus operations:

from cube_to_rag.tools import MilvusHelper

# Create helper (uses credentials from environment by default)
helper = MilvusHelper(collection_name="my_collection")

# Or with custom connection arguments (for authentication)
helper_with_auth = MilvusHelper(
    collection_name="my_collection",
    connection_args={
        "uri": "http://localhost:19530",
        "token": "root:your-password"  # Format: username:password
    }
)

# Search with scores
results = helper.search_with_score("machine learning", k=5)
for doc, score in results:
    print(f"Score: {score:.3f} - {doc.page_content}")

# Search with metadata filtering
filtered_results = helper.search(
    query="neural networks",
    k=3,
    filter_dict={"source": "research_papers"}
)

# Get collection statistics
stats = helper.get_collection_stats()
print(f"Total documents: {stats['num_entities']}")

# Delete documents
helper.delete(["id1", "id2", "id3"])

Configuration

Set environment variables or create a .env file:

# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small'

# API Keys
ANTHROPIC_API_KEY='your-key'
OPENAI_API_KEY='your-key'

# Cube.js
CUBE_URL='http://localhost:4000'  # Base URL
# Optional: Cube.js authentication (JWT auto-generated from secret)
# CUBEJS_API_SECRET='your-cubejs-api-secret'

# Milvus
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'

Available Tools

Tool                           Purpose                    Example Use Case
get_cube_schema_search_tool()  Search Cube.js schemas     Discover available cubes/dimensions/measures
get_cube_graphql_tools()       Execute Cube.js queries    Query analytics data via GraphQL
get_vector_search_tool()       Search vector database     Find relevant documents semantically
create_milvus_helper()         Manage Milvus operations   Add/search/delete documents in Milvus

API Endpoints

Chat Endpoints

Create Chat Session

POST /chat/new

Response:

{
  "status": "success",
  "session_id": "uuid-here"
}

Ask Question

POST /chat/ask/
Content-Type: application/json

{
  "message": "What's the average GPA by department?"
}

Response: Streaming text response

Get Chat History

GET /chat/history

Clear Chat History

DELETE /chat/clear

Embeddings Endpoints

Ingest Schemas

POST /embeddings/ingest
Content-Type: application/json

{
  "schema_dir": "/path/to/schemas"  # Optional
}

Response:

{
  "success": true,
  "schemas_ingested": 5,
  "message": "Successfully ingested 5 cube schemas"
}

Search Schemas

POST /embeddings/search?query=course%20performance&k=3

Response:

{
  "success": true,
  "query": "course performance",
  "results": [
    {
      "cube_name": "CoursePerformanceSummary",
      "dimensions": [...],
      "measures": [...],
      "relevance_score": 0.95
    }
  ]
}

Health Check

GET /embeddings/health

Usage Examples

Python Client

import requests

# Create session
session = requests.Session()
response = session.post(
    "http://localhost:8080/chat/new",
    headers={"x-access-token": "your-token"}
)

# Ask question
response = session.post(
    "http://localhost:8080/chat/ask/",
    headers={"x-access-token": "your-token"},
    json={"message": "What's the average GPA by department?"},
    stream=True
)

# Stream response
for chunk in response.iter_content(chunk_size=1024):
    if chunk:
        print(chunk.decode('utf-8'), end='', flush=True)

JavaScript/TypeScript Client

// Create session
const response = await fetch('http://localhost:8080/chat/new', {
  method: 'POST',
  headers: {
    'x-access-token': 'your-token'
  }
});

// Ask question with streaming
const askResponse = await fetch('http://localhost:8080/chat/ask/', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-access-token': 'your-token'
  },
  body: JSON.stringify({
    message: 'What\'s the average GPA by department?'
  })
});

// Stream response
const reader = askResponse.body.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(new TextDecoder().decode(value));
}

cURL Examples

# Create session
curl -X POST http://localhost:8080/chat/new \
  -H "x-access-token: your-token"

# Ask question
curl -X POST http://localhost:8080/chat/ask/ \
  -H "Content-Type: application/json" \
  -H "x-access-token: your-token" \
  -d '{"message": "Show me top 5 courses by enrollment"}' \
  --no-buffer

# Search schemas
curl -X POST "http://localhost:8080/embeddings/search?query=student%20grades&k=2"

# Trigger schema ingestion
curl -X POST http://localhost:8080/embeddings/ingest \
  -H "Content-Type: application/json"

Example Queries

The RAG system can answer questions like:

  • "What's the average GPA by department?"
  • "Show me course pass rates"
  • "Which courses have the highest student engagement?"
  • "List all courses with their enrollment numbers"
  • "Compare performance across different semesters"
  • "What dimensions are available in the data?"
  • "Show me all measures I can query"

How It Works

1. Schema Ingestion

On startup, the system:

  1. Fetches all cube metadata from Cube.js API (/cubejs-api/v1/meta)
  2. Extracts cube names, dimensions, measures, and descriptions
  3. Generates text embeddings using your configured embedding model
  4. Stores embeddings in Milvus for semantic search
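Steps 2 and 3 amount to flattening each cube's metadata into a text chunk that gets embedded. A simplified sketch (the real formatting is done via the package's Jinja2 templates; cube_to_text and the sample metadata are illustrative):

```python
def cube_to_text(cube: dict) -> str:
    """Flatten one cube from /cubejs-api/v1/meta into a text chunk
    suitable for embedding. Simplified sketch; the package formats
    schemas with Jinja2 templates."""
    dims = ", ".join(d["name"] for d in cube.get("dimensions", []))
    measures = ", ".join(m["name"] for m in cube.get("measures", []))
    return (
        f"Cube: {cube['name']}. "
        f"Description: {cube.get('description', 'n/a')}. "
        f"Dimensions: {dims}. Measures: {measures}."
    )


meta = {
    "name": "CoursePerformanceSummary",
    "description": "Aggregated course outcomes",
    "dimensions": [{"name": "CoursePerformanceSummary.department"}],
    "measures": [{"name": "CoursePerformanceSummary.avgGpa"}],
}
print(cube_to_text(meta))
```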

2. Query Processing

When you ask a question:

  1. Schema Search: Performs semantic search in Milvus to find relevant cubes
  2. Query Generation: LLM agent constructs a GraphQL query using discovered schema
  3. Query Execution: Executes query against Cube.js GraphQL API
  4. Response Formatting: Formats results into natural language response
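Step 2 (query generation) boils down to assembling a GraphQL document from the discovered schema. A hand-rolled sketch of the approximate query shape — the agent's actual output may differ, and build_cube_graphql is a hypothetical helper:

```python
def build_cube_graphql(cube: str, measures: list[str], dimensions: list[str]) -> str:
    """Assemble a Cube.js-style GraphQL query string.

    Illustrative sketch of what the query-generation step produces;
    note that field names must be camelCase in the GraphQL API.
    """
    fields = "\n      ".join(measures + dimensions)
    return (
        "query {\n"
        "  cube {\n"
        f"    {cube} {{\n"
        f"      {fields}\n"
        "    }\n"
        "  }\n"
        "}"
    )


print(build_cube_graphql(
    "coursePerformanceSummary",
    measures=["avgGpa"],
    dimensions=["department"],
))
```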

3. Agent Tools

The LLM agent has access to two tools:

  • cube_schema_search: Semantic search for cubes/dimensions/measures
  • graphql: Execute GraphQL queries against Cube.js

Supported LLM Providers

OpenAI

LLM_MODEL_ID='openai:gpt-4'
# or
LLM_MODEL_ID='openai:gpt-4o-mini'

EMBEDDING_MODEL='openai:text-embedding-3-small'
OPENAI_API_KEY='sk-your-key'

Anthropic Claude

LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small'  # Claude doesn't have embeddings

ANTHROPIC_API_KEY='sk-ant-your-key'
OPENAI_API_KEY='sk-your-key'  # Still needed for embeddings

AWS Bedrock

LLM_MODEL_ID='bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0'
EMBEDDING_MODEL='bedrock:amazon.titan-embed-text-v2:0'

AWS_ACCESS_KEY_ID='your-key'
AWS_SECRET_ACCESS_KEY='your-secret'
AWS_DEFAULT_REGION='us-east-1'

Development

Running Tests

poetry run pytest

Code Formatting

poetry run black .
poetry run ruff check .

Deployment

Docker

FROM python:3.12-slim

WORKDIR /app

# Install dependencies
RUN pip install cube-to-rag

# Copy environment file
COPY .env .env

# Run server
CMD ["uvicorn", "app.server.main:app", "--host", "0.0.0.0", "--port", "8080"]

Environment Variables

For production, set:

DEPLOY_ENV='prod'
SECRET_KEY='strong-random-secret'
FAST_API_ACCESS_SECRET_TOKEN='strong-random-token'

Troubleshooting

Schema search returns no results

# Check embeddings health
curl http://localhost:8080/embeddings/health

# Re-ingest schemas
curl -X POST http://localhost:8080/embeddings/ingest

GraphQL queries fail

  • Verify Cube.js is running: curl http://localhost:4000/readyz
  • Check cube names use camelCase in GraphQL (e.g., coursePerformanceSummary not CoursePerformanceSummary)
  • Review FastAPI logs: docker logs cube-rag-api
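Since the GraphQL API exposes PascalCase cube names with a lower-cased first letter, a tiny helper can normalize names before querying. This is a hypothetical convenience function, not part of the package:

```python
def to_graphql_case(cube_name: str) -> str:
    """Lower-case the first letter of a PascalCase cube name, matching
    how Cube.js exposes cubes in its GraphQL API. Hypothetical helper
    for illustration."""
    return cube_name[:1].lower() + cube_name[1:] if cube_name else cube_name


print(to_graphql_case("CoursePerformanceSummary"))  # coursePerformanceSummary
```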

LLM errors

  • Verify API keys are set correctly
  • Check model IDs match your provider
  • Ensure sufficient API credits/quota
