VSAR: VSA-grounded reasoning with approximate joins
VSAR: VSA-grounded Reasoning
VSAR (VSAX Reasoner) is a reasoning system grounded in Vector Symbolic Architectures (VSA) that provides fast approximate querying over large knowledge bases using hypervector algebra. It is built on the VSAX library for GPU-accelerated VSA operations.
Key Features
- Fast approximate querying: Query 10^6+ facts with subsymbolic retrieval
- VSARL language: Declarative syntax for facts, queries, and rules
- Interactive REPL: Load files and query interactively
- CLI interface: Simple commands for ingestion, querying, and export
- Multiple formats: Load facts from CSV, JSONL, or VSAR files
- Trace layer: Explanation DAG for debugging and transparency
- Deterministic results: Reproducible outputs with fixed seeds
- HDF5 persistence: Save and load knowledge bases
- Comprehensive testing: 295 tests with 98.6% coverage
Quick Start
Installation
Option 1: Install from PyPI (recommended for users)
pip install vsar
# Verify installation
vsar --help
Option 2: Development install with uv
# Install uv
pip install uv
# Clone and install
git clone https://github.com/vasanthsarathy/vsar.git
cd vsar
uv sync
# For development, use uv run
uv run vsar --help
Hello World - CLI
Create a simple VSAR program family.vsar:
@model FHRR(dim=8192, seed=42);
@threshold(value=0.22);
fact parent(alice, bob).
fact parent(alice, carol).
fact parent(bob, dave).
fact parent(carol, eve).
query parent(alice, X)?
query parent(X, dave)?
Run it:
# After pip install vsar
vsar run family.vsar
# Or during development with uv
uv run vsar run family.vsar
Output:
Inserted 4 facts
┌─────────────────────────┐
│ Query: parent(alice, X) │
├────────┬────────────────┤
│ Entity │ Score          │
├────────┼────────────────┤
│ bob    │ 0.9234         │
│ carol  │ 0.9156         │
└────────┴────────────────┘
Interactive REPL
Start an interactive session to load files and query on the fly:
vsar repl
Example session:
VSAR Interactive REPL
Type 'help' for commands, 'exit' to quit
> load family.vsar
Loaded family.vsar
Inserted 4 facts
> query parent(alice, X)?
┌─────────────────────────┐
│ Query: parent(alice, X) │
├────────┬────────────────┤
│ Entity │ Score          │
├────────┼────────────────┤
│ bob    │ 0.9234         │
│ carol  │ 0.9156         │
└────────┴────────────────┘
> query parent(X, dave)?
┌───────────────────────┐
│ Query: parent(X, dave)│
├────────┬──────────────┤
│ Entity │ Score        │
├────────┼──────────────┤
│ bob    │ 0.8876       │
└────────┴──────────────┘
> stats
Knowledge Base Statistics
Total Facts: 4
Predicates: parent (4 facts)
> exit
Goodbye!
CLI Commands
Ingest Facts
# From CSV (predicate in first column)
vsar ingest facts.csv --kb family.h5
# From CSV (all rows same predicate)
vsar ingest parents.csv --predicate parent --kb family.h5
# From JSONL
vsar ingest facts.jsonl --kb family.h5
Export and Inspect
# Export KB to JSON
vsar export family.h5 --format json --output facts.json
# Export to JSONL
vsar export family.h5 --format jsonl --output facts.jsonl
# Inspect KB statistics
vsar inspect family.h5
Advanced Options
# JSON output for scripting
vsar run program.vsar --json
# Show trace DAG
vsar run program.vsar --trace
# Limit results per query
vsar run program.vsar --k 10
VSARL Language
Directives
Configure the reasoning engine:
// Model configuration
@model FHRR(dim=8192, seed=42); // FHRR backend, 8192 dimensions
@model MAP(dim=4096, seed=100); // MAP backend (alternative)
// Retrieval parameters
@threshold(value=0.22); // Similarity threshold
@beam(width=50); // Beam width (Phase 2)
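To make the retrieval parameters concrete, here is a minimal sketch of how a similarity threshold and a per-query result limit could filter scored candidates. It is an illustration under assumptions, not VSAR's implementation: `select_results` is a hypothetical helper, the `>=` comparison is a guess at the threshold semantics, and the scores for dave and eve are invented for the example (the bob and carol scores come from the sample output above).

```python
def select_results(scored, threshold=0.22, k=None):
    """Keep candidates at or above the threshold, best first, at most k of them.

    Hypothetical helper: the inclusive comparison is an assumption.
    """
    kept = [(entity, score) for entity, score in scored if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept if k is None else kept[:k]


# Illustrative scores only; dave and eve are made-up low-similarity candidates.
candidates = [("bob", 0.9234), ("dave", 0.0412), ("carol", 0.9156), ("eve", 0.1987)]

print(select_results(candidates))        # threshold only
print(select_results(candidates, k=1))   # threshold plus a result limit
```

Raising the threshold trades recall for precision, while the limit (the `--k` flag in the CLI) only caps how many surviving results are shown.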
Facts
Ground atoms (all arguments are constants):
fact parent(alice, bob).
fact parent(bob, carol).
fact lives_in(alice, boston).
fact transfer(alice, bob, money). // Ternary fact
fact person(alice). // Unary fact
Queries
Single-atom queries with one variable (Phase 1):
query parent(alice, X)? // Find children of alice
query parent(X, carol)? // Find parents of carol
query lives_in(X, boston)? // Who lives in boston?
query transfer(alice, X, money)? // Alice transferred money to X
Phase 1 Limitation: Only single-atom queries with exactly one variable are supported. Conjunctive queries are planned for Phase 2.
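The Phase 1 query shape can be sketched as a small parser that turns query text into a predicate plus an argument list with `None` in the variable position, the same shape the Python API's `Query(predicate=..., args=[...])` takes. This is a rough illustration, not the VSARL grammar: `parse_query` is a hypothetical helper, and treating any term that starts with an uppercase letter as a variable is an assumption inferred from the examples.

```python
import re

# Optional "query" keyword, predicate name, parenthesized terms, optional "?".
ATOM_PATTERN = re.compile(r"^\s*(?:query\s+)?(\w+)\s*\(([^()]*)\)\s*\??\s*$")


def parse_query(text):
    """Parse a single-atom query into (predicate, args); the variable becomes None."""
    match = ATOM_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a single-atom query: {text!r}")
    predicate = match.group(1)
    terms = [term.strip() for term in match.group(2).split(",")]
    # Assumption: variables start with an uppercase letter, constants do not.
    args = [None if term[:1].isupper() else term for term in terms]
    if args.count(None) != 1:
        raise ValueError("Phase 1 queries must contain exactly one variable")
    return predicate, args


print(parse_query("query parent(alice, X)?"))
print(parse_query("query transfer(alice, X, money)?"))
```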
Comments
// Single-line comment
/* Multi-line
comment */
File Formats
CSV Format
With predicate column (first column = predicate):
parent,alice,bob
parent,bob,carol
lives_in,alice,boston
Without predicate (use --predicate flag):
alice,bob
bob,carol
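The two CSV layouts differ only in where the predicate comes from. The sketch below reads both into `(predicate, args)` tuples using the standard library; `read_csv_facts` is a hypothetical helper written for illustration and is not part of the vsar package.

```python
import csv
import io


def read_csv_facts(text, predicate=None):
    """Parse CSV text into (predicate, args) tuples.

    With predicate=None the first column is the predicate; otherwise every
    column is an argument of the given predicate.
    """
    facts = []
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        if not any(cells):
            continue  # skip blank lines
        if predicate is None:
            facts.append((cells[0], tuple(cells[1:])))
        else:
            facts.append((predicate, tuple(cells)))
    return facts


with_column = "parent,alice,bob\nparent,bob,carol\nlives_in,alice,boston\n"
without_column = "alice,bob\nbob,carol\n"

print(read_csv_facts(with_column))
print(read_csv_facts(without_column, predicate="parent"))
```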
JSONL Format
One fact per line:
{"predicate": "parent", "args": ["alice", "bob"]}
{"predicate": "parent", "args": ["bob", "carol"]}
{"predicate": "lives_in", "args": ["alice", "boston"]}
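As a quick way to check a JSONL file before ingesting it, the following sketch validates each line and renders the facts as VSARL `fact` statements. Both helpers are hypothetical and use only the standard library; they assume every argument is a string, as in the examples above.

```python
import json


def read_jsonl_facts(text):
    """Parse JSONL text into (predicate, args) tuples, one fact per non-empty line."""
    facts = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        record = json.loads(line)
        if "predicate" not in record or "args" not in record:
            raise ValueError(f"line {line_number}: expected 'predicate' and 'args' keys")
        facts.append((record["predicate"], tuple(record["args"])))
    return facts


def to_vsarl(facts):
    """Render facts as VSARL fact statements, one per line."""
    return "\n".join(f"fact {pred}({', '.join(args)})." for pred, args in facts)


jsonl_text = (
    '{"predicate": "parent", "args": ["alice", "bob"]}\n'
    '{"predicate": "lives_in", "args": ["alice", "boston"]}\n'
)
print(to_vsarl(read_jsonl_facts(jsonl_text)))
```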
VSAR Format
Native .vsar program files (see VSARL Language above).
Python API
High-Level API (Recommended)
from vsar.language.ast import Directive, Fact, Query
from vsar.language.loader import load_facts
from vsar.semantics.engine import VSAREngine
# Create engine from directives
directives = [
    Directive(name="model", params={"type": "FHRR", "dim": 512, "seed": 42})
]
engine = VSAREngine(directives)
# Load and insert facts
facts = load_facts("facts.csv")
for fact in facts:
    engine.insert_fact(fact)
# Execute query
query = Query(predicate="parent", args=["alice", None])
result = engine.query(query, k=5)
for entity, score in result.results:
    print(f"{entity}: {score:.4f}")
# Inspect trace
trace = engine.trace.get_dag()
for event in trace:
    print(f"{event.type}: {event.payload}")
# Save KB
engine.save_kb("family.h5")
Low-Level API (Phase 0 Foundation)
from vsar.kernel.vsa_backend import FHRRBackend
from vsar.symbols.registry import SymbolRegistry
from vsar.encoding.vsa_encoder import VSAEncoder
from vsar.encoding.roles import RoleVectorManager
from vsar.kb.store import KnowledgeBase
from vsar.retrieval.query import Retriever
# Create VSA system
backend = FHRRBackend(dim=512, seed=42)
registry = SymbolRegistry(backend, seed=42)
encoder = VSAEncoder(backend, registry, seed=42)
kb = KnowledgeBase(backend)
role_manager = RoleVectorManager(backend, seed=42)
retriever = Retriever(backend, registry, kb, encoder, role_manager)
# Insert facts
atom_vec = encoder.encode_atom("parent", ["alice", "bob"])
kb.insert("parent", atom_vec, ("alice", "bob"))
# Query: parent(alice, X)
results = retriever.retrieve("parent", 2, {"1": "alice"}, k=5)
print(results) # [('bob', 0.85), ...]
Architecture
VSAR uses a layered architecture:
Phase 0 Layers (Foundation)
- Kernel (vsar.kernel): VSA operations (FHRR/MAP backends via VSAX)
- Symbols (vsar.symbols): Typed symbol spaces (E, R, A, C, T, S) with basis management
- Encoding (vsar.encoding): Role-filler binding for atoms (predicate + arguments)
- KB (vsar.kb): Predicate-partitioned storage with HDF5 persistence
- Retrieval (vsar.retrieval): Unbinding, cleanup, top-k similarity search
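The encode, bundle, unbind, and cleanup steps named in these layers can be illustrated with a toy FHRR-style model in pure Python. This is a sketch under stated assumptions, not VSAR's or VSAX's code: it uses the standard FHRR operations (unit complex phasors, element-wise multiplication to bind, conjugate multiplication to unbind), but the product-style atom encoding, the helper names, and the small dimension are all choices made for this example.

```python
import cmath
import math
import random

DIM = 1024  # kept small for a pure-Python demo; the README examples use dim=8192


def random_hv(rng):
    """Random FHRR-style hypervector: DIM unit-magnitude complex phasors."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(DIM)]


def bind(a, b):
    """Binding: element-wise complex multiplication."""
    return [x * y for x, y in zip(a, b)]


def unbind(a, b):
    """Unbinding: element-wise multiplication by the conjugate of b."""
    return [x * y.conjugate() for x, y in zip(a, b)]


def bundle(vectors):
    """Bundling: element-wise sum (superposition)."""
    return [sum(column) for column in zip(*vectors)]


def similarity(a, b):
    """Mean real part of a * conj(b); 1.0 when a and b are the same phasor vector."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)


rng = random.Random(42)  # fixed seed, so the run is reproducible
entities = {name: random_hv(rng) for name in ("alice", "bob", "carol", "dave", "eve")}
role1, role2 = random_hv(rng), random_hv(rng)


def encode(arg1, arg2):
    """Assumed encoding of a binary atom: (role1 * arg1) * (role2 * arg2)."""
    return bind(bind(role1, entities[arg1]), bind(role2, entities[arg2]))


facts = [("alice", "bob"), ("alice", "carol"), ("bob", "dave"), ("carol", "eve")]
parent_bundle = bundle([encode(a, b) for a, b in facts])  # one bundle per predicate


def query_second_arg(first, k=2):
    """Answer parent(first, X): unbind the known parts, then clean up by similarity."""
    noisy = unbind(unbind(parent_bundle, bind(role1, entities[first])), role2)
    scored = [(name, similarity(noisy, vec)) for name, vec in entities.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


top = query_second_arg("alice")
print(top)
```

In this toy model the two children of alice score close to 1 while unrelated entities score close to 0, which is why a modest similarity threshold is enough to separate answers from noise. The noise grows with the number of facts in a bundle and shrinks with the dimension.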
Phase 1 Layers (Language & CLI)
- Language (vsar.language): VSARL parser (Lark), AST, loaders (CSV/JSONL/VSAR)
- Semantics (vsar.semantics): VSAREngine orchestrating all layers
- Trace (vsar.trace): Explanation DAG for transparency
- CLI (vsar.cli): Typer-based commands with Rich formatting
See docs/architecture.md for complete details.
Project Status
✅ Phase 0 (Foundation) - COMPLETE
- ✅ Kernel backend (FHRR VSA via VSAX)
- ✅ Symbol space management (6 typed spaces)
- ✅ Atom encoding (role-filler binding)
- ✅ KB storage (predicate-partitioned bundles)
- ✅ Retrieval primitive (unbind → cleanup)
- ✅ HDF5 persistence (KB + basis)
- ✅ Published to PyPI (v0.1.0)
✅ Phase 1 (Language & CLI) - COMPLETE
- ✅ VSARL parser (facts, queries, directives)
- ✅ Facts ingestion (CSV/JSONL/VSAR)
- ✅ Program execution engine
- ✅ Trace layer (explanation DAG)
- ✅ CLI interface (run, ingest, export, inspect)
- ✅ 281 tests, 98.5% coverage
🔜 Phase 2 (Rules & Chaining)
- Rule definitions (rule grandparent(X,Z) :- parent(X,Y), parent(Y,Z).)
- Bounded forward chaining
- Conjunctive queries
- Stratified negation
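For readers unfamiliar with the rule syntax, the sketch below shows what one forward-chaining step of the grandparent rule derives, using an exact symbolic join over Python tuples. It only illustrates the rule's meaning: the planned Phase 2 engine is described as approximate and VSA-based, and `grandparents` is a hypothetical helper, not part of vsar.

```python
from collections import defaultdict


def grandparents(parent_facts):
    """Apply grandparent(X,Z) :- parent(X,Y), parent(Y,Z) once, joining on Y."""
    children = defaultdict(set)
    for parent, child in parent_facts:
        children[parent].add(child)
    derived = set()
    for x, ys in children.items():
        for y in ys:
            # .get avoids inserting new keys while iterating over the dict
            for z in children.get(y, ()):
                derived.add((x, z))
    return derived


parent_facts = [("alice", "bob"), ("alice", "carol"), ("bob", "dave"), ("carol", "eve")]
print(sorted(grandparents(parent_facts)))
# [('alice', 'dave'), ('alice', 'eve')]
```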
🔜 Phase 3 (Optimizations)
- Indexing strategies
- Query planning
- Parallel execution
- Web interface
Examples
Example 1: Family Tree
@model FHRR(dim=8192, seed=42);
fact parent(alice, bob).
fact parent(bob, carol).
fact parent(carol, dave).
query parent(alice, X)? // Returns: bob (0.92)
query parent(X, carol)? // Returns: bob (0.88)
Example 2: Knowledge Graph
@model FHRR(dim=8192, seed=42);
@threshold(value=0.25);
fact lives_in(alice, boston).
fact lives_in(bob, cambridge).
fact works_at(alice, mit).
fact works_at(bob, harvard).
query lives_in(X, boston)? // Returns: alice
query works_at(alice, X)? // Returns: mit
Example 3: Large-Scale Ingestion
# Ingest 1M facts from CSV
vsar ingest large_dataset.csv \
--kb large.h5 \
--dim 8192 \
--seed 42
# Query the KB
vsar run queries.vsar --k 10
Performance
Approximate query performance (Phase 1):
- 10^3 facts: <100ms per query
- 10^4 facts: <200ms per query
- 10^5 facts: <500ms per query
- 10^6 facts: <1s per query
Measured on AMD EPYC 7742 CPU with dim=8192
Testing
VSAR has comprehensive test coverage:
# Run all tests
uv run pytest
# Run with coverage
uv run pytest --cov=vsar --cov-report=html
# Run specific suites
uv run pytest tests/unit/ # Unit tests
uv run pytest tests/integration/ # Integration tests
Test statistics:
- 281 tests (all passing)
- 98.5% coverage
- Unit tests: 261
- Integration tests: 20
Development
# Install development dependencies
uv sync --all-groups
# Run formatters
uv run black .
uv run ruff check . --fix
# Type checking
uv run mypy src/vsar
# Pre-commit hooks
uv run pre-commit install
uv run pre-commit run --all-files
# Build documentation
cd docs && uv run mkdocs serve
Documentation
- Architecture Overview - System design and layer details
- Getting Started - Tutorials and examples
- API Reference - Complete API documentation
- CLAUDE.md - Developer workflow guide
Contributing
Contributions welcome! See CONTRIBUTING.md for guidelines.
Citation
If you use VSAR in your research, please cite:
@software{vsar2025,
title = {VSAR: VSA-grounded Reasoning},
author = {VSAR Contributors},
year = {2025},
url = {https://github.com/vasanthsarathy/vsar}
}
License
MIT License - see LICENSE for details.
Acknowledgments