VSAR: VSA-grounded reasoning with approximate joins
VSAR: VSA-grounded Reasoning
VSAR (VSAX Reasoner) is a VSA-grounded reasoning system that combines Datalog-style logic programming with approximate vector matching. Built on the VSAX library for GPU-accelerated hypervector operations, VSAR enables fast approximate reasoning over large knowledge bases with explainable results.
Think of it as: "Datalog meets vector similarity search" - a foundation for approximate deductive reasoning at scale.
Key Features
Deductive Reasoning
- Horn clause rules - Full support for head :- body1, body2, ... syntax
- Forward chaining - Iterative rule application with fixpoint detection
- Transitive closure - Multi-hop inference (arbitrary depth)
- Semi-naive evaluation - Optimized chaining that avoids redundant work
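To make the forward-chaining bullets concrete, here is a minimal, purely symbolic sketch of semi-naive evaluation with fixpoint detection. It uses exact matching only and does not touch VSAR itself; the tuple-based fact and rule representation and every function name below are assumptions made for illustration, not VSAR's implementation.

```python
# Facts are (predicate, args) tuples; rules are (head, body) pairs.
# Arguments starting with an uppercase letter are treated as variables.

def unify(pattern, fact, bindings):
    """Match one body atom against one ground fact, extending the bindings."""
    (pred, args), (fpred, fargs) = pattern, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    out = dict(bindings)
    for arg, value in zip(args, fargs):
        if arg[0].isupper():                  # variable
            if out.setdefault(arg, value) != value:
                return None
        elif arg != value:                    # constant mismatch
            return None
    return out

def join(body, sources, bindings):
    """Yield bindings that satisfy every body atom from its assigned source."""
    if not body:
        yield bindings
        return
    for fact in sources[0]:
        extended = unify(body[0], fact, bindings)
        if extended is not None:
            yield from join(body[1:], sources[1:], extended)

def forward_chain(facts, rules, max_iterations=100):
    """Apply rules until no new fact appears (fixpoint) or the limit is hit."""
    facts = set(facts)
    delta = set(facts)                        # facts that were new last round
    for iteration in range(1, max_iterations + 1):
        new = set()
        for head, body in rules:
            # Semi-naive step: at least one body atom must match a delta fact.
            for i in range(len(body)):
                sources = [delta if j == i else facts for j in range(len(body))]
                for b in join(body, sources, {}):
                    derived = (head[0], tuple(b.get(a, a) for a in head[1]))
                    if derived not in facts:
                        new.add(derived)
        if not new:
            return facts, iteration           # fixpoint reached
        facts |= new
        delta = new
    return facts, max_iterations

facts = {("parent", ("alice", "bob")),
         ("parent", ("bob", "carol")),
         ("parent", ("carol", "dave"))}
rules = [
    (("ancestor", ("X", "Y")), [("parent", ("X", "Y"))]),
    (("ancestor", ("X", "Z")), [("parent", ("X", "Y")), ("ancestor", ("Y", "Z"))]),
]
closure, iterations = forward_chain(facts, rules)
descendants = sorted(a[1] for p, a in closure if p == "ancestor" and a[0] == "alice")
print(descendants)  # ['bob', 'carol', 'dave']
```

Restricting one body atom per pass to the facts derived in the previous round is what keeps later iterations from re-deriving everything from scratch.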
Approximate Matching
- VSA-based similarity - Fuzzy matching with confidence scores instead of exact symbolic matching
- Graceful degradation - Works with noisy data and typos
- Beam search joins - Prevents combinatorial explosion in multi-body rules
- Novelty detection - Prevents duplicate derivations via similarity threshold
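The beam search bullet can be pictured with a small sketch: candidate bindings carry scores, two body atoms are joined on their shared variable, and only the best `width` joined candidates are kept. Combining scores by multiplication and the function name `beam_join` are assumptions made for this illustration; VSAR's actual scoring may differ.

```python
import heapq

def beam_join(left, right, width):
    """Join scored (x, y) and (y, z) candidates on y, keeping the best `width`.

    left, right: lists of ((a, b), score) pairs.
    The joined score is the product of both input scores (an assumption).
    """
    by_key = {}
    for (y, z), score in right:
        by_key.setdefault(y, []).append((z, score))
    joined = []
    for (x, y), s1 in left:
        for z, s2 in by_key.get(y, []):
            joined.append(((x, z), s1 * s2))
    # Pruning to the beam width bounds how many candidates reach the next atom.
    return heapq.nlargest(width, joined, key=lambda item: item[1])

parents = [(("alice", "bob"), 0.9), (("bob", "carol"), 0.8), (("carol", "dave"), 0.7)]
grandparents = beam_join(parents, parents, width=1)
```

With `width=1` only the highest-scoring grandparent candidate survives; a wider beam trades more work for fewer missed derivations.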
Performance & Scale
- Fast approximate querying - Query 10^6+ facts with subsymbolic retrieval
- Vectorized operations - GPU-ready via JAX backend
- Predicate partitioning - Efficient KB organization
- HDF5 persistence - Save and load knowledge bases
Developer Experience
- VSARL language - Declarative syntax for facts, queries, and rules
- Interactive REPL - Load files and query interactively
- CLI interface - Simple commands for ingestion, querying, and export
- Full traceability - Explanation DAG for debugging and transparency
- Comprehensive testing - 392 tests with 97.56% coverage
Installation
From PyPI (Recommended)
pip install vsar
# Verify installation
vsar --help
Development Install
# Install uv (fast Python package installer)
pip install uv
# Clone and install
git clone https://github.com/vasanthsarathy/vsar.git
cd vsar
uv sync
# For development, use uv run
uv run vsar --help
Quick Start
Example 1: Basic Facts and Queries
Create family.vsar:
@model FHRR(dim=1024, seed=42);
fact parent(alice, bob).
fact parent(bob, carol).
fact parent(carol, dave).
query parent(alice, X)?
query parent(X, carol)?
Run it:
vsar run family.vsar
Output:
Inserted 3 facts
┌─────────────────────────┐
│ Query: parent(alice, X) │
├────────┬────────────────┤
│ Entity │ Score          │
├────────┼────────────────┤
│ bob    │ 0.9234         │
└────────┴────────────────┘
Example 2: Reasoning with Rules (Phase 2)
Create reasoning.vsar:
@model FHRR(dim=1024, seed=42);
@beam(width=50);
@novelty(threshold=0.95);
// Base facts: Parent relationships
fact parent(alice, bob).
fact parent(bob, carol).
fact parent(carol, dave).
// Rule: Derive grandparent relationship
rule grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
// Rule: Transitive closure for ancestors
rule ancestor(X, Y) :- parent(X, Y).
rule ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
// Queries
query grandparent(alice, X)?
query ancestor(alice, X)?
Run it:
vsar run reasoning.vsar
Output:
Inserted 3 facts
Applied 3 rules in 2 iterations
Derived 5 new facts
Fixpoint reached: true
┌──────────────────────────────┐
│ Query: grandparent(alice, X) │
├────────┬─────────────────────┤
│ Entity │ Score               │
├────────┼─────────────────────┤
│ carol  │ 0.8456              │
└────────┴─────────────────────┘
┌───────────────────────────┐
│ Query: ancestor(alice, X) │
├────────┬──────────────────┤
│ Entity │ Score            │
├────────┼──────────────────┤
│ bob    │ 0.9234           │
│ carol  │ 0.8876           │
│ dave   │ 0.8123           │
└────────┴──────────────────┘
Example 3: Interactive REPL
vsar repl
Example session:
VSAR Interactive REPL
Type 'help' for commands, 'exit' to quit
> load family.vsar
Loaded family.vsar
Inserted 3 facts
> query parent(alice, X)?
┌─────────────────────────┐
│ Query: parent(alice, X) │
├────────┬────────────────┤
│ Entity │ Score          │
├────────┼────────────────┤
│ bob    │ 0.9234         │
└────────┴────────────────┘
> stats
Knowledge Base Statistics
Total Facts: 3
Predicates: parent (3 facts)
> exit
Goodbye!
Documentation
Tutorials
- Getting Started - Your first VSAR program
- Tutorial: Family Tree Reasoning - Multi-hop inference
- Tutorial: Organizational Hierarchies - Manager chains
- Tutorial: Knowledge Graphs - Multi-relation reasoning
User Guides
- VSARL Language Reference - Complete syntax guide
- CLI Commands - Command-line interface
- Python API - Programmatic usage
- Architecture Overview - System design
Reference
- API Reference - Complete API documentation
- Examples Directory - 6 example programs with explanations
- PROGRESS.md - Current capabilities and limitations
- CHANGELOG.md - Version history
What Can VSAR Do?
Currently Supported (Phases 0-2)
Deductive Reasoning:
- Ground fact insertion and querying
- Horn clause rules (head :- body1, body2, ...)
- Forward chaining with fixpoint detection
- Multi-hop inference (transitive closure)
- Recursive rules (arbitrary depth)
- Multiple interacting rules
- Semi-naive evaluation optimization
Approximate Reasoning:
- Similarity-based retrieval (fuzzy matching)
- Confidence scores for all results
- Graceful degradation under noise
- Top-k ranked results
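As a rough picture of how a similarity threshold and top-k ranking interact, the sketch below filters scored candidates and returns the best `k`. The function name `top_k` and the inclusive comparison against the threshold are assumptions for illustration; the scores are made-up inputs, not VSAR output.

```python
import heapq

def top_k(scores, k, threshold=0.0):
    """Return up to k (entity, score) pairs with score >= threshold, best first."""
    kept = [(entity, s) for entity, s in scores.items() if s >= threshold]
    return heapq.nlargest(k, kept, key=lambda pair: pair[1])

# Hypothetical similarity scores for candidate bindings of one query variable.
candidates = {"bob": 0.92, "carol": 0.31, "dave": 0.05}
ranked = top_k(candidates, k=2, threshold=0.22)
```

Here `dave` is dropped by the threshold before ranking, so a low threshold mostly affects recall while `k` caps the size of the answer.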
Performance:
- Beam search joins (controls combinatorial explosion)
- Novelty detection (prevents duplicates)
- Vectorized operations (GPU-ready)
- Predicate partitioning
Developer Experience:
- Declarative VSARL language
- Interactive REPL
- CLI interface
- Full traceability and provenance
- HDF5 persistence
Limitations (Planned for Phase 3+)
- Single-variable queries only - parent(alice, ?) works, parent(?, ?) doesn't yet
- No negation - Cannot express not enemy(X, Y) or negation-as-failure
- No aggregation - Cannot count, sum, max, etc.
- Forward chaining only - No backward chaining or goal-directed search
- No magic sets - Cannot optimize query-driven derivation
See PROGRESS.md for detailed capability analysis and roadmap.
Architecture
VSAR uses a layered architecture:
┌─────────────────────────────────────────┐
│ VSARL Language & CLI                    │ (Phase 1)
│ Parser, AST, Engine, Trace, CLI         │
├─────────────────────────────────────────┤
│ Semantic Layer (Reasoning)              │ (Phase 2)
│ Substitution, Joins, Chaining, Rules    │
├─────────────────────────────────────────┤
│ Retrieval & Query Execution             │ (Phase 0)
│ Unbinding, Cleanup, Top-k Retrieval     │
├─────────────────────────────────────────┤
│ Knowledge Base (Storage)                │ (Phase 0)
│ Predicate Bundles, HDF5 Persistence     │
├─────────────────────────────────────────┤
│ Encoding (Role-Filler Binding)          │ (Phase 0)
│ Atom Encoding, Role Vector Management   │
├─────────────────────────────────────────┤
│ Symbol Registry (Typed Spaces)          │ (Phase 0)
│ E, R, A, C, T, S spaces + Basis Mgmt    │
├─────────────────────────────────────────┤
│ VSA Kernel (Hypervector Algebra)        │ (Phase 0)
│ FHRR, MAP Backends via VSAX             │
└─────────────────────────────────────────┘
Key Principles:
- Approximate is explicit - Every result has a similarity score
- Modular semantics - Clean separation of concerns
- Bounded inference - Beam widths, hop limits, novelty thresholds
- Typed symbol spaces - Entities (E), Relations (R), Attributes (A), etc.
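The lower layers (role-filler encoding, predicate bundles, unbinding, cleanup) can be illustrated with a pure-Python FHRR-style sketch. It does not use VSAX or VSAR; the encoding of a fact as the product of its two role-filler bindings, the role names `arg1`/`arg2`, and all helper names are assumptions chosen for illustration and may not match how VSAR encodes atoms.

```python
import cmath
import random

DIM = 1024  # same dimensionality as the examples in this README

def random_vector(rng, dim=DIM):
    """Random FHRR-style hypervector: unit-magnitude complex phasors."""
    return [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(dim)]

def bind(a, b):
    """Binding: element-wise complex multiplication (phases add)."""
    return [x * y for x, y in zip(a, b)]

def unbind(a, b):
    """Unbinding: multiply by the complex conjugate (phases subtract)."""
    return [x * y.conjugate() for x, y in zip(a, b)]

def bundle(vectors):
    """Bundling: element-wise sum of several hypervectors."""
    return [sum(column) for column in zip(*vectors)]

def similarity(a, b):
    """Mean real part of a * conj(b): near 1 for a match, near 0 otherwise."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / len(a)

def build_demo(seed=42):
    rng = random.Random(seed)
    entities = {n: random_vector(rng) for n in ("alice", "bob", "carol", "dave")}
    roles = {"arg1": random_vector(rng), "arg2": random_vector(rng)}
    pairs = [("alice", "bob"), ("bob", "carol"), ("carol", "dave")]
    # One vector per fact; the predicate bundle is the sum of its facts.
    facts = [bind(bind(roles["arg1"], entities[p]), bind(roles["arg2"], entities[c]))
             for p, c in pairs]
    return entities, roles, bundle(facts)

def query_second_arg(first, entities, roles, kb):
    """Score every entity as a candidate X in parent(first, X)."""
    probe = unbind(unbind(kb, bind(roles["arg1"], entities[first])), roles["arg2"])
    # Cleanup: compare the noisy probe against every known entity vector.
    return {name: similarity(probe, vec) for name, vec in entities.items()}

entities, roles, kb = build_demo()
scores = query_second_arg("alice", entities, roles, kb)
best = max(scores, key=scores.get)
print(best, round(scores[best], 4))
```

The score for the correct filler is not exactly 1 because the other bundled facts contribute noise, which is why every result carries an explicit similarity score.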
VSARL Language
Facts
fact parent(alice, bob).
fact parent(bob, carol).
fact lives_in(alice, boston).
fact transfer(alice, bob, money). // Ternary fact
fact person(alice). // Unary fact
Rules (Phase 2)
// Grandparent: X is grandparent of Z if X is parent of Y and Y is parent of Z
rule grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
// Ancestor: Base case
rule ancestor(X, Y) :- parent(X, Y).
// Ancestor: Recursive case (transitive closure)
rule ancestor(X, Z) :- parent(X, Y), ancestor(Y, Z).
// Sibling: Share same parent
rule sibling(X, Y) :- parent(Z, X), parent(Z, Y).
Queries
query parent(alice, X)? // Find children of alice
query parent(X, carol)? // Find parents of carol
query grandparent(alice, X)? // Find grandchildren of alice (via rules)
query ancestor(alice, X)? // Find all descendants (transitive)
Directives
// Model configuration
@model FHRR(dim=1024, seed=42); // FHRR backend, 1024 dimensions
@model MAP(dim=512, seed=100); // MAP backend (alternative)
// Retrieval parameters
@threshold 0.22; // Similarity threshold
@beam(width=50); // Beam width for joins
@novelty(threshold=0.95); // Novelty detection threshold
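One plausible reading of the novelty threshold is: a derived fact is kept only if its vector is less similar than the threshold to every vector already stored. The sketch below shows that check with cosine similarity on plain real-valued lists; the function names and the strict comparison are assumptions for illustration, not VSAR's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity of two non-zero real-valued vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_novel(candidate, known, threshold=0.95):
    """True if the candidate is below the threshold against every known vector."""
    return all(cosine(candidate, vec) < threshold for vec in known)

known = [[1.0, 0.0], [0.0, 1.0]]
print(is_novel([1.0, 0.01], known))  # False: nearly identical to a stored vector
print(is_novel([1.0, 1.0], known))   # True: similarity is about 0.71 to both
```

A higher threshold admits more near-duplicates, while a lower one may suppress genuinely new facts that happen to resemble existing ones.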
Comments
// Single-line comment
/* Multi-line
comment */
CLI Reference
Run Programs
# Run a VSAR program
vsar run program.vsar
# Limit results per query
vsar run program.vsar --k 10
# JSON output (for scripting)
vsar run program.vsar --json
# Show trace DAG
vsar run program.vsar --trace
Ingest Facts
# From CSV (predicate in first column)
vsar ingest facts.csv --kb family.h5
# From CSV (specify predicate)
vsar ingest parents.csv --predicate parent --kb family.h5
# From JSONL
vsar ingest facts.jsonl --kb family.h5
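For the "predicate in first column" CSV form, a file could be produced with a few lines of standard-library Python. The exact layout VSAR expects is an assumption here: this sketch writes one fact per row, with the predicate first, the arguments after it, and no header row. The helper name `write_facts_csv` is hypothetical.

```python
import csv

def write_facts_csv(path, facts):
    """Write (predicate, args) pairs as rows of the form: predicate,arg1,arg2,..."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        for predicate, args in facts:
            writer.writerow([predicate, *args])

if __name__ == "__main__":
    write_facts_csv("facts.csv", [
        ("parent", ("alice", "bob")),
        ("parent", ("bob", "carol")),
        ("person", ("alice",)),
    ])
```

Check the CLI documentation for the authoritative column layout before ingesting real data.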
Export & Inspect
# Export KB to JSON
vsar export family.h5 --format json --output facts.json
# Export to JSONL
vsar export family.h5 --format jsonl --output facts.jsonl
# Inspect KB statistics
vsar inspect family.h5
Interactive REPL
# Start interactive session
vsar repl
# Available commands:
# - load <file> Load a VSAR program
# - query <query> Execute a query
# - stats Show KB statistics
# - help Show help
# - exit Exit REPL
Python API
High-Level API (Recommended)
from vsar.language.ast import Directive, Fact, Query, Rule, Atom
from vsar.semantics.engine import VSAREngine
# Configure engine
directives = [
    Directive(name="model", params={"type": "FHRR", "dim": 1024, "seed": 42}),
    Directive(name="beam", params={"width": 50}),
    Directive(name="novelty", params={"threshold": 0.95}),
]
engine = VSAREngine(directives)
# Insert facts
engine.insert_fact(Fact(predicate="parent", args=["alice", "bob"]))
engine.insert_fact(Fact(predicate="parent", args=["bob", "carol"]))
# Define rules
rules = [
    Rule(
        head=Atom(predicate="grandparent", args=["X", "Z"]),
        body=[
            Atom(predicate="parent", args=["X", "Y"]),
            Atom(predicate="parent", args=["Y", "Z"]),
        ],
    )
]
# Query with automatic rule application
query = Query(predicate="grandparent", args=["alice", None])
result = engine.query(query, rules=rules, k=10)
for entity, score in result.results:
    print(f"{entity}: {score:.4f}")
# Inspect trace
trace = engine.trace.get_dag()
for event in trace:
    print(f"{event.type}: {event.payload}")
# Get KB statistics
stats = engine.stats()
print(f"Total facts: {stats['total_facts']}")
# Save/load KB
engine.save_kb("family.h5")
engine.load_kb("family.h5")
Forward Chaining
from vsar.semantics.chaining import apply_rules
# Apply rules with forward chaining (reuses engine and rules from above)
result = apply_rules(
    engine,
    rules,
    max_iterations=100,
    k=10,
    semi_naive=True,  # Use semi-naive evaluation (faster)
)
print(f"Iterations: {result.iterations}")
print(f"Total derived: {result.total_derived}")
print(f"Fixpoint reached: {result.fixpoint_reached}")
Loading from Files
from vsar.language.loader import ProgramLoader
from vsar.semantics.engine import VSAREngine
# Load VSAR program
loader = ProgramLoader()
program = loader.load_file("examples/02_family_tree.vsar")
# Create engine from program directives
engine = VSAREngine(program.directives)
# Insert all facts
for fact in program.facts:
    engine.insert_fact(fact)
# Execute all queries with rules
for query in program.queries:
    result = engine.query(query, rules=program.rules, k=10)
    print(f"Query: {query.predicate}({', '.join(str(a) for a in query.args)})")
    print(f"Results: {result.results}")
Performance
Approximate query performance (Phase 2, with rules):
| Facts | Query Time | Chaining Time (10 rules) |
|---|---|---|
| 10^3 | <50ms | <200ms |
| 10^4 | <100ms | <500ms |
| 10^5 | <300ms | <2s |
| 10^6 | <800ms | <10s |
Measured on AMD EPYC 7742 CPU with dim=1024, beam=50
Memory usage:
- Base: ~50MB (dim=1024)
- Per 1000 facts: ~5MB
- Scales linearly with fact count and dimensionality
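The memory figures above imply a simple linear estimate at dim=1024. The helper below just applies those two stated numbers (about 50 MB base, about 5 MB per 1000 facts); the function name is hypothetical, and the estimate does not model other dimensionalities.

```python
def estimate_memory_mb(num_facts, base_mb=50.0, mb_per_1000_facts=5.0):
    """Rough memory estimate in MB at dim=1024, using the README's figures."""
    return base_mb + mb_per_1000_facts * num_facts / 1000

for n in (10**3, 10**4, 10**5, 10**6):
    print(f"{n:>9} facts: ~{estimate_memory_mb(n):.0f} MB")
```

By this estimate a knowledge base of 10^6 facts needs roughly 5 GB, which is worth checking against available memory before a large ingest.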
Testing
# Run all tests
pytest
# Run with coverage
pytest --cov=vsar --cov-report=html
# Run specific test suites
pytest tests/unit/ # Unit tests
pytest tests/integration/ # Integration tests
pytest tests/integration/test_e2e_phase2.py # End-to-end tests
Test Statistics:
- 392 tests (all passing, 4 skipped)
- 97.56% coverage
- Unit tests: 370
- Integration tests: 22
- End-to-end tests: 5
Project Status & Roadmap
Phase 0: Foundation (Complete)
- VSA kernel (FHRR, MAP backends)
- Symbol registry and encoding
- KB storage with predicate partitioning
- Basic retrieval with similarity search
- HDF5 persistence
Phase 1: Ground KB + Queries (Complete)
- VSARL language parser
- Facts ingestion (CSV/JSONL/VSAR)
- Single-variable queries
- CLI interface and REPL
- Trace collection
Phase 2: Horn Rules + Chaining (Complete)
- Horn clause rules
- Variable substitution and unification
- Beam search joins
- Forward chaining with fixpoint detection
- Semi-naive evaluation
- Novelty detection
- Query with automatic rule application
Phase 3: Advanced Features (Planned)
- Multi-variable queries (parent(?, ?)?)
- Stratified negation
- Aggregation (count, sum, max)
- Backward chaining
- Magic sets optimization
Phase 4: Scale & Performance (Planned)
- Incremental maintenance
- Query planning and optimization
- Parallel execution
- GPU acceleration
See PROGRESS.md for detailed status and comparison to other reasoners.
Use Cases
Best suited for:
- Knowledge graph reasoning with noise tolerance
- Transitive closure queries (org hierarchies, supply chains)
- Multi-hop reasoning (family trees, social networks)
- Explainable AI (need provenance and similarity scores)
- Large-scale approximate reasoning (vectorized operations)
Not yet suitable for:
- Complex logical puzzles requiring negation
- Planning problems (need backward chaining)
- Ontology reasoning (need DL features)
- Answer set programming tasks
Contributing
Contributions welcome! Please see CONTRIBUTING.md for guidelines.
Development setup:
# Clone repo
git clone https://github.com/vasanthsarathy/vsar.git
cd vsar
# Install with dev dependencies
uv sync --all-groups
# Run tests
uv run pytest
# Format code
uv run black .
uv run ruff check . --fix
# Type check
uv run mypy src/vsar
License
MIT License - see LICENSE for details.
Citation
If you use VSAR in your research, please cite:
@software{vsar2025,
title = {VSAR: VSA-grounded Reasoning},
author = {Sarathy, Vasanth},
year = {2025},
url = {https://github.com/vasanthsarathy/vsar},
version = {0.3.0}
}
Acknowledgments
- Built on VSAX for VSA operations
- Inspired by Datalog, Prolog, and logic programming systems
- Uses Lark for parsing
- CLI powered by Typer and Rich
- Testing with pytest
Support
- Documentation: docs/
- Examples: examples/
- Issues: GitHub Issues
- Discussions: GitHub Discussions
Made with ❤️ for approximate reasoning at scale.