
Cube-to-RAG

Natural language interface for Cube.js analytics using RAG (Retrieval-Augmented Generation) with vector search and LLM agents.

Features

  • Semantic Schema Search: Discover cubes, dimensions, and measures using natural language
  • GraphQL Query Generation: Automatically build and execute Cube.js GraphQL queries
  • Vector Database: Store and search schema embeddings in Milvus
  • Multiple LLM Providers: Support for OpenAI, Anthropic Claude, and AWS Bedrock
  • Streaming Responses: Real-time token-by-token streaming for better UX
  • FastAPI Backend: Modern async Python web framework
  • Jinja2 Templates: Flexible schema formatting

Architecture

User Question
     ↓
1. Semantic Search (Milvus)
   └→ Find relevant cubes, dimensions, measures
     ↓
2. GraphQL Query Construction
   └→ Build query using discovered schema
     ↓
3. Execute Query (Cube.js GraphQL API)
   └→ http://cube_api:4000/cubejs-api/graphql
     ↓
4. Format & Explain Results
   └→ Natural language response
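The four steps above can be sketched end-to-end in Python with stubbed components. The function names here are illustrative only, not the package's API; steps 1 and 3 are stubbed where the real system would call Milvus and the Cube.js GraphQL endpoint.

```python
# Sketch of the four-step flow, with Milvus and Cube.js calls stubbed out.

def semantic_search(question: str) -> dict:
    # Step 1: would query Milvus for relevant schema; stubbed with a fixed hit
    return {"cube": "coursePerformanceSummary",
            "measures": ["avgGpa"], "dimensions": ["department"]}

def build_query(schema: dict) -> str:
    # Step 2: assemble a GraphQL query from the discovered schema
    fields = " ".join(schema["measures"] + schema["dimensions"])
    return f'query {{ cube {{ {schema["cube"]} {{ {fields} }} }} }}'

def execute_query(gql: str) -> list[dict]:
    # Step 3: would POST to http://cube_api:4000/cubejs-api/graphql; stubbed
    return [{"department": "Math", "avgGpa": 3.4}]

def format_answer(rows: list[dict]) -> str:
    # Step 4: would be phrased by the LLM; stubbed as a plain summary
    return f"Found {len(rows)} row(s): {rows}"

answer = format_answer(
    execute_query(build_query(semantic_search("average GPA by department?")))
)
print(answer)
```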

Installation

Using pip

pip install cube-to-rag

Using Poetry

poetry add cube-to-rag

From source

git clone https://github.com/yourusername/cube-to-rag.git
cd cube-to-rag
poetry install

Configuration

Create a .env file with your configuration:

# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'  # or 'openai:gpt-4' or 'bedrock:...'
EMBEDDING_MODEL='openai:text-embedding-3-small'  # or 'bedrock:amazon.titan-embed-text-v2:0'

# API Keys (choose based on your LLM/embedding provider)
ANTHROPIC_API_KEY='sk-ant-your-key-here'
OPENAI_API_KEY='sk-your-key-here'

# AWS Bedrock (if using Bedrock models)
AWS_ACCESS_KEY_ID='your-aws-key'
AWS_SECRET_ACCESS_KEY='your-aws-secret'
AWS_DEFAULT_REGION='us-east-1'

# Cube.js Configuration
CUBE_URL='http://cube_api:4000'  # Base URL (GraphQL and REST APIs are auto-constructed)
# Optional: Cube.js API secret (JWT tokens are auto-generated)
# CUBEJS_API_SECRET='your-cubejs-api-secret'

# Milvus Vector Database
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'

# Security
SECRET_KEY='your-secret-key-here'
FAST_API_ACCESS_SECRET_TOKEN='your-access-token'
DEPLOY_ENV='local'  # or 'prod'
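For reference, the JWT auto-generated from CUBEJS_API_SECRET is a standard HS256 token. The package handles this internally; the sketch below only illustrates the mechanics with the standard library, and make_cube_jwt is a hypothetical helper.

```python
import base64
import hashlib
import hmac
import json
import time


def make_cube_jwt(secret: str, ttl_seconds: int = 3600) -> str:
    """Build an HS256 JWT like the one sent to Cube.js.

    Hypothetical helper for illustration -- the package generates
    its token automatically from CUBEJS_API_SECRET.
    """
    def b64url(data: bytes) -> str:
        return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps({"exp": int(time.time()) + ttl_seconds}).encode())
    signing_input = f"{header}.{payload}".encode()
    signature = b64url(
        hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    )
    return f"{header}.{payload}.{signature}"


token = make_cube_jwt("your-cubejs-api-secret")
# Sent to Cube.js in the Authorization header
```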

Quick Start

1. Start Required Services

You'll need:

  • Cube.js running with GraphQL API enabled
  • Milvus vector database

Using Docker Compose:

docker-compose -f docker-compose.yml -f docker-compose.milvus.yml -f docker-compose.ai.yml up -d

2. Run the API Server

# Using uvicorn directly
uvicorn app.server.main:app --host 0.0.0.0 --port 8080

# Or using the installed package
python -m cube_to_rag

3. Ingest Cube.js Schemas

The API will automatically fetch and ingest schemas from Cube.js on startup. You can also trigger manual ingestion:

curl -X POST http://localhost:8080/embeddings/ingest \
  -H "Content-Type: application/json"

Using in Your Own Applications

The cube-to-rag package provides tools and utilities you can add to your existing LangChain agents and applications.

Installation

pip install cube-to-rag
# or
poetry add cube-to-rag

1. Add Cube.js Extraction Tool to Your Agent

Add Cube.js querying capabilities to any existing LangChain agent:

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate
from langchain_community.agent_toolkits.load_tools import load_tools

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_cube_schema_search_tool

# Your existing tools
your_existing_tools = [...]

# Add Cube.js extraction tools
cube_schema_search = get_cube_schema_search_tool(k=3)

# Use LangChain's built-in GraphQL tool for queries
graphql_tools = load_tools(
    ["graphql"],
    graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
    llm=get_llm()
)

# Combine all tools
tools = your_existing_tools + [cube_schema_search] + graphql_tools

# Create agent with all tools
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to analytics data."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

# Now your agent can answer questions like:
result = agent_executor.invoke({
    "input": "What's the average course GPA by department?"
})

2. Add Documents to Milvus Vector Database

Easily embed and store your documents in Milvus:

from langchain_core.documents import Document
from cube_to_rag.tools import create_milvus_helper

# Create helper for your collection
helper = create_milvus_helper(collection_name="my_documents")

# Add documents
docs = [
    Document(
        page_content="Your document content here",
        metadata={"source": "doc1.pdf", "page": 1}
    ),
    Document(
        page_content="More content",
        metadata={"source": "doc2.pdf", "page": 1}
    )
]

ids = helper.add_documents(docs)
print(f"Added {len(ids)} documents to Milvus")

# Or add text directly
texts = ["First document", "Second document"]
metadatas = [{"source": "text1"}, {"source": "text2"}]

ids = helper.add_texts(texts, metadatas)

3. Add Vector Search Tool to Your Agent

Add semantic document search to any LangChain agent:

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import get_vector_search_tool

# Your existing tools
your_existing_tools = [...]

# Add vector search tool
vector_search = get_vector_search_tool(
    collection_name="my_documents",
    k=5
)

# Combine tools
tools = your_existing_tools + [vector_search]

# Create agent
llm = get_llm()
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant with access to document search."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

# Now your agent can search documents
result = agent_executor.invoke({
    "input": "Find documents about machine learning"
})

4. Complete Example: Agent with All Tools

from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.agent_toolkits.load_tools import load_tools

from cube_to_rag.core.llm import get_llm
from cube_to_rag.tools import (
    get_cube_schema_search_tool,
    get_cube_graphql_tools,
    get_vector_search_tool,
    create_milvus_helper
)

# Initialize tools
cube_schema = get_cube_schema_search_tool(k=2)
vector_search = get_vector_search_tool(collection_name="docs", k=3)

# JWT tokens are auto-generated from settings.cube_api_secret
# No need to pass api_token manually - it's handled automatically
graphql_tools = get_cube_graphql_tools(
    graphql_endpoint="http://localhost:4000/cubejs-api/graphql",
    llm=get_llm()
)

tools = [cube_schema, vector_search] + graphql_tools

# Create agent
llm = get_llm()

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a powerful analytics assistant with access to:

    1. Cube.js analytics (use cube_schema_search FIRST, then graphql)
    2. Document search (use vector_search)

    Always search for relevant schemas/documents before answering."""),
    ("human", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,
    handle_parsing_errors=True
)

# Use the agent
result = agent_executor.invoke({
    "input": "What metrics are available for student performance?"
})
print(result["output"])

5. Working with Milvus Directly

Advanced Milvus operations:

from cube_to_rag.tools import MilvusHelper

# Create helper (uses credentials from environment by default)
helper = MilvusHelper(collection_name="my_collection")

# Or with custom connection arguments (for authentication)
helper_with_auth = MilvusHelper(
    collection_name="my_collection",
    connection_args={
        "uri": "http://localhost:19530",
        "token": "root:your-password"  # Format: username:password
    }
)

# Search with scores
results = helper.search_with_score("machine learning", k=5)
for doc, score in results:
    print(f"Score: {score:.3f} - {doc.page_content}")

# Search with metadata filtering
filtered_results = helper.search(
    query="neural networks",
    k=3,
    filter_dict={"source": "research_papers"}
)

# Get collection statistics
stats = helper.get_collection_stats()
print(f"Total documents: {stats['num_entities']}")

# Delete documents
helper.delete(["id1", "id2", "id3"])

Configuration

Set environment variables or create a .env file:

# LLM Configuration
LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small'

# API Keys
ANTHROPIC_API_KEY='your-key'
OPENAI_API_KEY='your-key'

# Cube.js
CUBE_URL='http://localhost:4000'  # Base URL
# Optional: Cube.js authentication (JWT auto-generated from secret)
# CUBEJS_API_SECRET='your-cubejs-api-secret'

# Milvus
MILVUS_SERVER_URI='http://localhost:19530'
# Optional: Milvus authentication (for secured instances)
# MILVUS_USER='root'
# MILVUS_PASSWORD='your-milvus-password'

Available Tools

Tool                           Purpose                    Example Use Case
get_cube_schema_search_tool()  Search Cube.js schemas     Discover available cubes/dimensions/measures
get_cube_graphql_tools()       Execute Cube.js queries    Query analytics data via GraphQL
get_vector_search_tool()       Search vector database     Find relevant documents semantically
create_milvus_helper()         Manage Milvus operations   Add/search/delete documents in Milvus

API Endpoints

Chat Endpoints

Create Chat Session

POST /chat/new

Response:

{
  "status": "success",
  "session_id": "uuid-here"
}

Ask Question

POST /chat/ask/
Content-Type: application/json

{
  "message": "What's the average GPA by department?"
}

Response: Streaming text response

Get Chat History

GET /chat/history

Clear Chat History

DELETE /chat/clear

Embeddings Endpoints

Ingest Schemas

POST /embeddings/ingest
Content-Type: application/json

{
  "schema_dir": "/path/to/schemas"  # Optional
}

Response:

{
  "success": true,
  "schemas_ingested": 5,
  "message": "Successfully ingested 5 cube schemas"
}

Search Schemas

POST /embeddings/search?query=course%20performance&k=3

Response:

{
  "success": true,
  "query": "course performance",
  "results": [
    {
      "cube_name": "CoursePerformanceSummary",
      "dimensions": [...],
      "measures": [...],
      "relevance_score": 0.95
    }
  ]
}

Health Check

GET /embeddings/health

Usage Examples

Python Client

import requests

# Create session
session = requests.Session()
response = session.post(
    "http://localhost:8080/chat/new",
    headers={"x-access-token": "your-token"}
)

# Ask question
response = session.post(
    "http://localhost:8080/chat/ask/",
    headers={"x-access-token": "your-token"},
    json={"message": "What's the average GPA by department?"},
    stream=True
)

# Stream response
for chunk in response.iter_content(chunk_size=1024):
    if chunk:
        print(chunk.decode('utf-8'), end='', flush=True)

JavaScript/TypeScript Client

// Create session
const response = await fetch('http://localhost:8080/chat/new', {
  method: 'POST',
  headers: {
    'x-access-token': 'your-token'
  }
});

// Ask question with streaming
const askResponse = await fetch('http://localhost:8080/chat/ask/', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'x-access-token': 'your-token'
  },
  body: JSON.stringify({
    message: 'What\'s the average GPA by department?'
  })
});

// Stream response
const reader = askResponse.body.getReader();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  console.log(new TextDecoder().decode(value));
}

cURL Examples

# Create session
curl -X POST http://localhost:8080/chat/new \
  -H "x-access-token: your-token"

# Ask question
curl -X POST http://localhost:8080/chat/ask/ \
  -H "Content-Type: application/json" \
  -H "x-access-token: your-token" \
  -d '{"message": "Show me top 5 courses by enrollment"}' \
  --no-buffer

# Search schemas
curl -X POST "http://localhost:8080/embeddings/search?query=student%20grades&k=2"

# Trigger schema ingestion
curl -X POST http://localhost:8080/embeddings/ingest \
  -H "Content-Type: application/json"

Example Queries

The RAG system can answer questions like:

  • "What's the average GPA by department?"
  • "Show me course pass rates"
  • "Which courses have the highest student engagement?"
  • "List all courses with their enrollment numbers"
  • "Compare performance across different semesters"
  • "What dimensions are available in the data?"
  • "Show me all measures I can query"

How It Works

1. Schema Ingestion

On startup, the system:

  1. Fetches all cube metadata from Cube.js API (/cubejs-api/v1/meta)
  2. Extracts cube names, dimensions, measures, and descriptions
  3. Generates text embeddings using your configured embedding model
  4. Stores embeddings in Milvus for semantic search
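Steps 2 and 3 amount to flattening each cube's metadata into a text chunk that gets embedded. A simplified sketch (the real formatting is done via the package's Jinja2 templates; cube_to_text and the sample metadata are illustrative):

```python
def cube_to_text(cube: dict) -> str:
    """Flatten one cube from /cubejs-api/v1/meta into a text chunk
    suitable for embedding. Simplified sketch; the package formats
    schemas with Jinja2 templates."""
    dims = ", ".join(d["name"] for d in cube.get("dimensions", []))
    measures = ", ".join(m["name"] for m in cube.get("measures", []))
    return (
        f"Cube: {cube['name']}. "
        f"Description: {cube.get('description', 'n/a')}. "
        f"Dimensions: {dims}. Measures: {measures}."
    )


meta = {
    "name": "CoursePerformanceSummary",
    "description": "Aggregated course outcomes",
    "dimensions": [{"name": "CoursePerformanceSummary.department"}],
    "measures": [{"name": "CoursePerformanceSummary.avgGpa"}],
}
print(cube_to_text(meta))
```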

2. Query Processing

When you ask a question:

  1. Schema Search: Performs semantic search in Milvus to find relevant cubes
  2. Query Generation: LLM agent constructs a GraphQL query using discovered schema
  3. Query Execution: Executes query against Cube.js GraphQL API
  4. Response Formatting: Formats results into natural language response
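Step 2 (query generation) boils down to assembling a GraphQL document from the discovered schema. A hand-rolled sketch of the approximate query shape — the agent's actual output may differ, and build_cube_graphql is a hypothetical helper:

```python
def build_cube_graphql(cube: str, measures: list[str], dimensions: list[str]) -> str:
    """Assemble a Cube.js-style GraphQL query string.

    Illustrative sketch of what the query-generation step produces;
    note that field names must be camelCase in the GraphQL API.
    """
    fields = "\n      ".join(measures + dimensions)
    return (
        "query {\n"
        "  cube {\n"
        f"    {cube} {{\n"
        f"      {fields}\n"
        "    }\n"
        "  }\n"
        "}"
    )


print(build_cube_graphql(
    "coursePerformanceSummary",
    measures=["avgGpa"],
    dimensions=["department"],
))
```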

3. Agent Tools

The LLM agent has access to two tools:

  • cube_schema_search: Semantic search for cubes/dimensions/measures
  • graphql: Execute GraphQL queries against Cube.js

Supported LLM Providers

OpenAI

LLM_MODEL_ID='openai:gpt-4'
# or
LLM_MODEL_ID='openai:gpt-4o-mini'

EMBEDDING_MODEL='openai:text-embedding-3-small'
OPENAI_API_KEY='sk-your-key'

Anthropic Claude

LLM_MODEL_ID='anthropic:claude-3-5-sonnet-20241022'
EMBEDDING_MODEL='openai:text-embedding-3-small'  # Claude doesn't have embeddings

ANTHROPIC_API_KEY='sk-ant-your-key'
OPENAI_API_KEY='sk-your-key'  # Still needed for embeddings

AWS Bedrock

LLM_MODEL_ID='bedrock:anthropic.claude-3-5-sonnet-20240620-v1:0'
EMBEDDING_MODEL='bedrock:amazon.titan-embed-text-v2:0'

AWS_ACCESS_KEY_ID='your-key'
AWS_SECRET_ACCESS_KEY='your-secret'
AWS_DEFAULT_REGION='us-east-1'

Development

Running Tests

poetry run pytest

Code Formatting

poetry run black .
poetry run ruff check .

Deployment

Docker

FROM python:3.12-slim

WORKDIR /app

# Install dependencies
RUN pip install cube-to-rag

# Copy environment file
COPY .env .env

# Run server
CMD ["uvicorn", "app.server.main:app", "--host", "0.0.0.0", "--port", "8080"]

Environment Variables

For production, set:

DEPLOY_ENV='prod'
SECRET_KEY='strong-random-secret'
FAST_API_ACCESS_SECRET_TOKEN='strong-random-token'

Troubleshooting

Schema search returns no results

# Check embeddings health
curl http://localhost:8080/embeddings/health

# Re-ingest schemas
curl -X POST http://localhost:8080/embeddings/ingest

GraphQL queries fail

  • Verify Cube.js is running: curl http://localhost:4000/readyz
  • Check cube names use camelCase in GraphQL (e.g., coursePerformanceSummary not CoursePerformanceSummary)
  • Review FastAPI logs: docker logs cube-rag-api
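Since the GraphQL API exposes PascalCase cube names with a lower-cased first letter, a tiny helper can normalize names before querying. This is a hypothetical convenience function, not part of the package:

```python
def to_graphql_case(cube_name: str) -> str:
    """Lower-case the first letter of a PascalCase cube name, matching
    how Cube.js exposes cubes in its GraphQL API. Hypothetical helper
    for illustration."""
    return cube_name[:1].lower() + cube_name[1:] if cube_name else cube_name


print(to_graphql_case("CoursePerformanceSummary"))  # coursePerformanceSummary
```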

LLM errors

  • Verify API keys are set correctly
  • Check model IDs match your provider
  • Ensure sufficient API credits/quota
