VTK Python code generation with prompt clarification, task decomposition, and sequential generation (MCP + optional RAG)

VTK Sequential Thinking

VTK Python code generation with prompt clarification, task decomposition, and sequential code generation, grounded by MCP-based VTK API tooling and optional RAG retrieval.

Overview

This project turns a user prompt into runnable VTK Python code via three stages:

  • Prompt clarification (ClarificationSession)
  • Task decomposition (DecompositionSession)
  • Code generation (GenerationSession)

Quick Start

Prerequisites

  • Python 3.10+
  • uv - Fast Python package manager
# Install uv if not already installed
curl -LsSf https://astral.sh/uv/install.sh | sh

1. Run Setup

./setup.sh

This creates a .venv virtual environment using uv and installs dependencies interactively.

Or install manually:

# Create virtual environment
uv venv .venv
source .venv/bin/activate

# Install package with dev dependencies
uv pip install -e ".[dev]"

# Optional extras
uv pip install -e ".[llm]"   # LLM providers
uv pip install -e ".[mcp]"   # VTK API tooling
uv pip install -e ".[rag]"   # RAG (requires Qdrant)
uv pip install -e ".[vtk]"   # VTK runtime

# All extras
uv pip install -e ".[dev,llm,mcp,rag,vtk]"

2. Configure Environment

cp .env.example .env
# Edit .env with your LLM API key

3. Start Qdrant (only needed for RAG)

docker run -d -p 6333:6333 qdrant/qdrant
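
Before indexing or querying, it can help to confirm that the container is actually answering. The following is a minimal sketch, not part of this package: the helper name qdrant_is_reachable is hypothetical, and it assumes an unauthenticated Qdrant instance, so an instance protected by an API key would be reported as unreachable.

```python
import urllib.error
import urllib.request


def qdrant_is_reachable(base_url: str = "http://localhost:6333", timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/collections answers with HTTP 200."""
    url = base_url.rstrip("/") + "/collections"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, HTTP error status,
        # or a malformed URL: treat all of them as "not reachable".
        return False
```

Calling qdrant_is_reachable() with no arguments checks the default address used by the docker command above.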

4. Index Your Data

You'll need to index your VTK documentation. The data files are:

  • data/vtk-python-docs.jsonl (61 MB) - API documentation
  • data/raw/vtk-python-examples.jsonl (5.4 MB) - Code examples
  • data/raw/vtk-python-tests.jsonl (4.8 MB) - Test cases

Note: Indexing tools are in the parent vtk-rag repository. You need to build the Qdrant index before querying.
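
Since the data files are JSON Lines, a quick sanity check before indexing is to count how many lines parse as JSON. This is a sketch under that single assumption; it does not inspect record fields, because the schema of these files is not described here, and the function name summarize_jsonl is hypothetical.

```python
import json
from pathlib import Path


def summarize_jsonl(path: Path) -> dict:
    """Count valid JSON records and malformed lines in a JSON Lines file."""
    records = 0
    malformed = 0
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # blank lines are ignored
            try:
                json.loads(line)
                records += 1
            except json.JSONDecodeError:
                malformed += 1
    return {"records": records, "malformed": malformed}
```

For example, summarize_jsonl(Path("data/vtk-python-docs.jsonl")) returns a dictionary with the two counts.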

5. Use the CLI

source .venv/bin/activate
vtk-st --help

# Evaluate prompt clarity
vtk-st evaluate "Read a VTK file and visualize it"

# Clarify a prompt (interactive by default)
vtk-st query "Read a VTK file and visualize it"

# Decompose into tasks
vtk-st decompose "Read volume.vti and create an isosurface at value 135"

# Full pipeline
vtk-st pipeline "Read volume.vti and create an isosurface at value 135"

Repository Structure

vtk-sequential-thinking/
├── pyproject.toml
├── README.md
├── setup.sh
├── examples/
├── tests/
└── vtk_sequential_thinking/

Architecture

High-level pipeline

User Prompt
  -> ClarificationSession (optional, interactive)
  -> DecompositionSession (LLM + MCP tooling)
  -> GenerationSession (LLM + MCP + RAG)
  -> Python code output

Project structure (current)

vtk_sequential_thinking/
├── __init__.py
├── cli.py
├── config.py
├── llm/
│   ├── __init__.py
│   ├── client.py
│   └── json_protocol.py
├── mcp/
│   ├── __init__.py
│   ├── client.py
│   └── persistent_client.py
├── prompt_clarification/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── clarifier.py
│   └── session.py
├── task_decomposition/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── decomposer.py
│   └── session.py
├── sequential_generation/
│   ├── __init__.py
│   ├── models.py
│   ├── prompts.py
│   ├── generator.py
│   ├── code_assembler.py
│   └── session.py
└── rag/
    ├── __init__.py
    ├── client.py
    ├── models.py
    └── ranking.py

Public API (library)

The library exports three “session” entry points:

  • ClarificationSession (prompt -> synthesized prompt)
  • DecompositionSession (prompt -> tasks)
  • GenerationSession (tasks -> code)

They are exported from vtk_sequential_thinking/__init__.py as aliases of the internal Session classes in each subpackage.


Stage 1: Prompt clarification

Key data:

  • SessionResponse.status: one of clear, needs_clarification, ready_to_synthesize, synthesized, restart, skipped
  • SessionResponse.prompt: the original prompt
  • SessionResponse.questions: pending questions (if any)
  • SessionResponse.synthesized_prompt: only set after synthesis
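
To make the status values concrete, here is a stand-in model and a helper that picks the prompt to pass on to decomposition. It is only a sketch: SessionResponseSketch is a hypothetical dataclass mirroring the fields listed above (the real model lives in models.py and may differ), the question type is assumed to be a plain string, and treating skipped like clear is an assumption.

```python
from dataclasses import dataclass, field
from typing import Literal, Optional

Status = Literal[
    "clear", "needs_clarification", "ready_to_synthesize",
    "synthesized", "restart", "skipped",
]


@dataclass
class SessionResponseSketch:
    """Hypothetical mirror of the SessionResponse fields listed above."""
    status: Status
    prompt: str
    questions: list[str] = field(default_factory=list)
    synthesized_prompt: Optional[str] = None


def prompt_for_decomposition(resp: SessionResponseSketch) -> Optional[str]:
    """Return the prompt for the next stage, or None if clarification is unfinished."""
    if resp.status == "synthesized":
        return resp.synthesized_prompt
    if resp.status in ("clear", "skipped"):
        # Assumption: a skipped clarification falls back to the original prompt.
        return resp.prompt
    return None
```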

Key files:

  • vtk_sequential_thinking/prompt_clarification/models.py
  • vtk_sequential_thinking/prompt_clarification/clarifier.py
  • vtk_sequential_thinking/prompt_clarification/session.py

Stage 2: Task decomposition

Key data:

  • Task: {id, task_type, description, search_query, depends_on, vtk_classes, from_prompt}
  • DecompositionResult: {tasks, output_type, reasoning}
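
The depends_on field suggests that tasks form a dependency graph. The sketch below shows one way such a list could be put into a valid execution order with the standard library. TaskSketch is a hypothetical stand-in for the real Task model, the field types (integer ids, list-valued depends_on and vtk_classes) are assumptions, and nothing here claims that the package orders tasks this way.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class TaskSketch:
    """Hypothetical mirror of the Task fields listed above."""
    id: int
    task_type: str
    description: str
    search_query: str = ""
    depends_on: list[int] = field(default_factory=list)
    vtk_classes: list[str] = field(default_factory=list)
    from_prompt: bool = True


def execution_order(tasks: list[TaskSketch]) -> list[int]:
    """Return task ids so that every task comes after the tasks it depends on."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    sorter = TopologicalSorter({task.id: set(task.depends_on) for task in tasks})
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(sorter.static_order())
```

For a reader task, a contour task that depends on it, and a render task that depends on the contour, execution_order returns the ids in that order regardless of how the input list is sorted.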

The decomposition session supports:

  • decompose(prompt)
  • refine(modifications, additions)
  • finalize()

Key files:

  • vtk_sequential_thinking/task_decomposition/models.py
  • vtk_sequential_thinking/task_decomposition/prompts.py
  • vtk_sequential_thinking/task_decomposition/decomposer.py
  • vtk_sequential_thinking/task_decomposition/session.py

Stage 3: Sequential code generation

Flow:

tasks[]
  -> Generator.generate(task)
     - (optional) retrieve examples via RAG
     - tool loop via MCP (VTK API grounding)
     - JSONProtocol decoding into TaskResult
  -> CodeAssembler.add_snippet(...)
  -> CodeAssembler.assemble() -> final_code
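
The assembly step at the end of this flow can be pictured with a small stand-in. This is a sketch only: AssemblerSketch is hypothetical, the single-string add_snippet signature is an assumption (the real CodeAssembler signature is not shown here), and the import handling is deliberately naive, hoisting only unindented single-line import statements.

```python
class AssemblerSketch:
    """Collects per-task snippets and joins them into one script."""

    def __init__(self) -> None:
        self._imports: list[str] = []
        self._bodies: list[str] = []

    def add_snippet(self, code: str) -> None:
        """Split a snippet into top-level imports (deduplicated) and body code."""
        body_lines = []
        for line in code.strip().splitlines():
            if line.startswith(("import ", "from ")):
                if line not in self._imports:
                    self._imports.append(line)
            else:
                body_lines.append(line)
        self._bodies.append("\n".join(body_lines).strip())

    def assemble(self) -> str:
        """Return imports first, then each snippet body in insertion order."""
        parts = ["\n".join(self._imports)] if self._imports else []
        parts.extend(body for body in self._bodies if body)
        return "\n\n".join(parts) + "\n"
```

Adding two snippets that both start with import vtk yields a script with a single import line followed by the two bodies.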

Output:

  • PipelineResult.code: final assembled code
  • PipelineResult.task_results: per-task outputs

Key files:

  • vtk_sequential_thinking/sequential_generation/session.py
  • vtk_sequential_thinking/sequential_generation/generator.py
  • vtk_sequential_thinking/sequential_generation/code_assembler.py
  • vtk_sequential_thinking/sequential_generation/models.py

CLI mapping

The CLI is implemented in vtk_sequential_thinking/cli.py using Typer.

  • vtk-st evaluate: clarity evaluation only
  • vtk-st query: interactive clarification (outputs synthesized prompt)
  • vtk-st decompose: prompt -> tasks JSON
  • vtk-st generate: tasks JSON -> code
  • vtk-st pipeline: clarify -> decompose -> generate

Tests

Tests are split into offline-safe unit tests and CLI-level integration tests:

  • tests/unit/
  • tests/integration/

Many integration tests monkeypatch external clients so they can run without live services.
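
The idea of replacing an external client so a test runs offline can be sketched as follows. The project's tests use pytest's monkeypatch according to the sentence above; this sketch uses the standard library's unittest.mock instead to stay self-contained, and RemoteClientSketch and evaluate_clarity are hypothetical names, not classes or functions from this package.

```python
from unittest import mock


class RemoteClientSketch:
    """Stand-in for a client that would normally call a live service."""

    def complete(self, prompt: str) -> str:
        raise RuntimeError("network access is not available in tests")


def evaluate_clarity(client: RemoteClientSketch, prompt: str) -> bool:
    """Toy function under test: treats the reply 'clear' as a clear prompt."""
    return client.complete(prompt).strip().lower() == "clear"


def test_evaluate_clarity_offline() -> None:
    client = RemoteClientSketch()
    # Replace the network-bound method with a canned reply for this test only.
    with mock.patch.object(client, "complete", return_value="clear") as fake:
        assert evaluate_clarity(client, "Read volume.vti and create an isosurface at value 135")
    fake.assert_called_once()
```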


Examples

  • examples/clarification_example.py: clarification only
  • examples/decomposition_example.py: decomposition/refinement only
  • examples/generation_example.py: generation only
  • examples/pipeline_example.py: full pipeline demonstration

LLM providers

The LLM client supports multiple hosted providers (the LLM_PROVIDER setting also accepts local):

  • OpenAI
  • Anthropic
  • Google

Configuration

Environment Variables (.env)

# LLM Provider (choose one)
LLM_PROVIDER=anthropic          # anthropic, openai, google, local

# API Keys
ANTHROPIC_API_KEY=...
OPENAI_API_KEY=...
GOOGLE_API_KEY=...

# Model Selection
ANTHROPIC_MODEL=...
OPENAI_MODEL=...
GOOGLE_MODEL=...

# VTK API docs (used by vtkapi-mcp tooling)
VTK_API_DOCS_PATH=data/vtk-python-docs.jsonl

# Qdrant (RAG)
QDRANT_URL=http://localhost:6333
QDRANT_CODE_COLLECTION=vtk_code

Usage Examples

Programmatic usage

from vtk_sequential_thinking import (
    ClarificationSession,
    DecompositionSession,
    GenerationSession,
    LLMClient,
    MCPClient,
    load_config,
)

config = load_config()
llm_client = LLMClient(app_config=config)
mcp_client = MCPClient(app_config=config)

# 1) Clarify
clarify = ClarificationSession.from_config(config, llm_client=llm_client)
resp = clarify.submit_prompt("Read a VTK file and visualize it")
if resp.status != "clear":
    # In a real app, you'd iterate questions and then call synthesize()
    resp = clarify.synthesize()
synthesized_prompt = resp.prompt if resp.status == "clear" else (resp.synthesized_prompt or "")

# 2) Decompose
decomposer = DecompositionSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
decomp = decomposer.decompose(synthesized_prompt)

# 3) Generate
generator = GenerationSession.from_config(config, llm_client=llm_client, mcp_client=mcp_client)
result = generator.generate(tasks=decomp.tasks, original_prompt=synthesized_prompt)
print(result.code)

Development

Tests

uv run pytest tests

Lint

uv run ruff check vtk_sequential_thinking/ tests/

Coverage (terminal)

uv run pytest tests --cov=vtk_sequential_thinking --cov-report=term-missing

Core Dependencies

  • pydantic - Data validation
  • python-dotenv - Environment configuration
  • typer / rich - CLI
  • anthropic / openai / google-generativeai - LLM providers
  • mcp - MCP client for VTK API validation
  • vtkapi-mcp - VTK API MCP server

Not Included

  • Indexing Tools - Use parent vtk-rag repository

Notes

  • RAG requires Qdrant: vtk_sequential_thinking.rag.client expects a live Qdrant server.
  • VTK API tooling: configure VTK_API_DOCS_PATH for MCP-based API grounding/validation.

License

MIT
