Skip to main content

AI-native creation organism. Create, critique, and evolve cultural art.

Project description

VULCA

PyPI version Python 3.10+ License: Apache 2.0 Tests

Multi-dimensional creative evaluation with cultural awareness. L1-L5 scoring across 13+ domains, self-evolving weights, pluggable model backends.

VULCA evaluates visual works — from traditional ink wash paintings to UI mockups — using a five-layer framework (L1-L5) that goes beyond surface aesthetics into cultural context, critical interpretation, and philosophical depth. Each domain has its own weights, terminology, and taboos. The system learns from every evaluation session.

Based on peer-reviewed research: VULCA Framework (EMNLP 2025 Findings) and VULCA-Bench (7,410 samples across 9 traditions).

Install

pip install vulca

Quick Start

import vulca

# Evaluate traditional art
result = vulca.evaluate("painting.jpg", tradition="chinese_xieyi")
print(result.score)          # 0.82
print(result.dimensions)     # {"L1": 0.75, "L2": 0.82, "L3": 0.86, "L4": 0.79, "L5": 0.84}
print(result.rationales)     # {"L1": "Masterful use of blank space (留白)...", ...}

# Evaluate a UI design
result = vulca.evaluate("mockup.png", tradition="ui_ux_design")

# Evaluate a photograph
result = vulca.evaluate("photo.jpg", tradition="photography")

# Create through the full pipeline (Scout → Draft → Critic → Decide)
result = vulca.create("Misty mountains in ink wash", tradition="chinese_xieyi")
print(result.status)         # "completed"
print(result.weighted_total) # 0.82

# Human-in-the-loop with custom dimension weights
result = vulca.create(
    "水墨山水",
    provider="mock",
    hitl=True,
    weights={"L1": 0.3, "L2": 0.2, "L3": 0.2, "L4": 0.15, "L5": 0.15},
)
# result.status == "waiting_human" — pipeline pauses for human review

# Mock mode — no API key needed, great for testing
result = vulca.evaluate("painting.jpg", mock=True)

CLI

# List all available domains (traditions + design + contemporary)
vulca traditions

# Get cultural guide for a domain (terminology, taboos, weights)
vulca tradition chinese_xieyi
vulca tradition photography
vulca tradition brand_design

# Evaluate artwork with L1-L5 scores
vulca evaluate painting.jpg --tradition chinese_xieyi
vulca evaluate mockup.png --tradition ui_ux_design
vulca evaluate photo.jpg --tradition photography

# Create through the pipeline
vulca create "Garden scene" --provider mock

# Check evolution status (how weights have shifted)
vulca evolution chinese_xieyi

# HITL mode with custom weights
vulca create "水墨山水" --hitl --weights "L1=0.3,L2=0.2,L3=0.2,L4=0.15,L5=0.15"

# Use different image/VLM providers
vulca create "landscape" --image-provider openai
vulca evaluate photo.jpg --vlm-model ollama/llava

# JSON output for scripting
vulca evaluate painting.jpg --mock --json

AI Agent Integration (MCP)

VULCA ships a built-in MCP server for AI coding assistants. 6 tools with view (summary/detailed) and format (json/markdown) parameters.

vulca-mcp  # Starts MCP server (stdio transport)

Claude Code

claude mcp add vulca -- vulca-mcp

OpenAI Codex CLI

codex mcp add vulca -- vulca-mcp

OpenHands

[mcp]
stdio_servers = [{ name = "vulca", command = "vulca-mcp", args = [] }]

MCP Tools

Tool Description
create_artwork Create artwork through the pipeline, returns image + L1-L5 scores + rationale
evaluate_artwork Evaluate artwork on L1-L5 dimensions with rationale text
list_traditions List all domains with emphasis and description
get_tradition_guide Full cultural context: terminology, taboos, weights, evolution
resume_artwork HITL continuation: accept, refine (with feedback), or reject
get_evolution_status Weight evolution history and insights per domain

L1-L5 Evaluation Framework

Five layers of evaluation, each reinterpreted per domain:

Layer General Traditional Art UI/UX Design Photography
L1 Surface Perception Composition, ink harmony Visual discoverability Composition, light
L2 Technical Execution Brushwork, medium mastery Pattern compliance, WCAG Exposure, focus, processing
L3 Contextual Fit Cultural tradition adherence User mental model match Genre conventions
L4 Critical Reading Narrative depth, symbolism Intent-behavior alignment Narrative power
L5 Deeper Meaning Philosophical aesthetics Design ethics Artistic vision

The same scoring engine handles all domains. L3-L5 weights shift dramatically — Chinese Xieyi weights L5 (philosophy) at 30%, while Brand Design weights L2 (system adherence) at 30%.

Domains

Cultural Traditions

Tradition Emphasis L1 L2 L3 L4 L5
Chinese Xieyi (写意) Philosophical 0.10 0.15 0.25 0.20 0.30
Chinese Gongbi (工笔) Technical 0.15 0.30 0.25 0.15 0.15
Japanese Traditional Philosophical 0.15 0.20 0.20 0.20 0.25
Western Academic Technical 0.20 0.25 0.15 0.25 0.15
Islamic Geometric Technical 0.25 0.30 0.20 0.15 0.10
Watercolor Cultural 0.20 0.25 0.15 0.20 0.20
African Traditional Cultural 0.15 0.20 0.30 0.20 0.15
South Asian Cultural 0.15 0.20 0.25 0.15 0.25

Design & Contemporary

Domain Emphasis L1 L2 L3 L4 L5
Contemporary Art Art-Historical 0.10 0.15 0.30 0.25 0.20
Photography Balanced 0.25 0.25 0.20 0.20 0.10
Brand Design Technical 0.25 0.30 0.25 0.15 0.05
UI/UX Design Technical 0.20 0.30 0.25 0.20 0.05

All weights evolve automatically through the self-evolution system. Run vulca tradition <name> to see domain-specific terminology and taboos.

Pluggable Providers

Bring your own models for image generation and VLM scoring:

from vulca.providers import ImageProvider, ImageResult

class MyLocalSD(ImageProvider):
    async def generate(self, prompt, **kwargs):
        # Call your model
        return ImageResult(image_b64="...", mime="image/png")

result = vulca.create("landscape", image_provider=MyLocalSD())

Built-in providers: mock, gemini (Google), openai (DALL-E 3), comfyui (local).

BYOK Quick Start

Gemini (Google AI):

export GOOGLE_API_KEY=your-key-here
vulca create "Misty mountains" --provider gemini --tradition chinese_xieyi
vulca evaluate painting.jpg --tradition chinese_xieyi  # VLM scoring uses same key

OpenAI (DALL-E 3):

export OPENAI_API_KEY=your-key-here
vulca create "Zen garden" --provider openai --tradition japanese_traditional

ComfyUI (local Stable Diffusion):

# Start ComfyUI server first, then:
vulca create "Oil painting" --provider comfyui --image-base-url http://localhost:8188

Local VLM (Ollama, vLLM, etc.):

vulca evaluate photo.jpg --vlm-model ollama/llava --vlm-base-url http://localhost:11434

No API key? Use mock mode for testing:

vulca create "水墨山水" --provider mock
vulca evaluate painting.jpg --mock

Self-Evolution

The system learns from every evaluation session:

  1. ContextEvolver adjusts L1-L5 weights per domain based on scoring patterns
  2. FewShotUpdater selects high-scoring examples (>=0.75) as calibration benchmarks
  3. VLM prompt receives evolved weights, few-shot references, and domain insights
  4. vulca.get_weights(domain) returns evolved weights, falling back to YAML defaults

Contributing a Domain

Add a YAML file to src/vulca/cultural/data/traditions/:

name: my_domain
display_name:
  en: "My Domain"
  zh: ""
weights:
  L1: 0.20
  L2: 0.20
  L3: 0.20
  L4: 0.20
  L5: 0.20
terminology:
  - term: "key concept"
    definition: "What this means in this domain"
taboos:
  - rule: "Do not judge X as Y in this domain"
    severity: medium

See _template.yaml for the full schema. PRs welcome.

Architecture

vulca/
├── pipeline/     # Execution engine + 3 built-in nodes + 3 templates
├── providers/    # Pluggable ImageProvider + VLMProvider protocols
├── cultural/     # 13 YAML domain configs with L1-L5 weights
├── scoring/      # VLM model routing + L1-L5 types
├── storage/      # Session/Feedback protocol + JSONL backend
├── intent/       # Natural language → domain resolution
├── _vlm.py       # Gemini Vision L1-L5 scoring (core)
├── cli.py        # CLI entry point
└── mcp_server.py # MCP server (6 tools, FastMCP)

Canvas UI

The full web experience is live at vulcaart.art:

  • Real-time pipeline visualization with Intelligence Log
  • HITL decision panel with L1-L5 rationale display
  • Weight sliders for custom evaluation
  • Reference image upload for guided creation
  • Gallery with domain filtering

Tests

pip install vulca[dev]
pytest tests/ -v  # 276 tests

Citation

@inproceedings{yu2025vulca,
  title={VULCA: A Framework for Cultural Art Evaluation},
  author={Yu, Haorui},
  booktitle={Findings of EMNLP 2025},
  year={2025}
}

License

Apache 2.0

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vulca-0.3.0.tar.gz (91.8 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

vulca-0.3.0-py3-none-any.whl (101.4 kB view details)

Uploaded Python 3

File details

Details for the file vulca-0.3.0.tar.gz.

File metadata

  • Download URL: vulca-0.3.0.tar.gz
  • Upload date:
  • Size: 91.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for vulca-0.3.0.tar.gz
Algorithm Hash digest
SHA256 011871e3d3cd0e28c785cb2c19c019b287d41f7f4570248d335e4accb2a065ef
MD5 cda6c8fa939527ae63b60c6875dbc07c
BLAKE2b-256 11c5bcd10228875e0c60ab1a85ceba6582a09f832f51dee8f5facf4e574831f5

See more details on using hashes here.

File details

Details for the file vulca-0.3.0-py3-none-any.whl.

File metadata

  • Download URL: vulca-0.3.0-py3-none-any.whl
  • Upload date:
  • Size: 101.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.3

File hashes

Hashes for vulca-0.3.0-py3-none-any.whl
Algorithm Hash digest
SHA256 4dd9f8a47d59dc8407a4c529b183efdfedff7cf982a10798c8c36c032c985d05
MD5 7f3c08ac8342eb7d7d18fc4856eecdba
BLAKE2b-256 25b75d1f95c7fdbea56465cf8895b3dd1c312553ddb81633016d82ee0adbe4c8

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page