MCard: Local-first Content Addressable Storage with Content Type Detection

Python 3.10+ MIT License ruff Build Status

MCard

MCard is a local-first, content-addressable storage platform with cryptographic integrity, temporal ordering, and a Polynomial Type Runtime (PTR) that orchestrates polyglot execution. It gives teams a verifiable data backbone without sacrificing developer ergonomics or observability.
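To make "content-addressable storage with cryptographic integrity" concrete, here is a minimal sketch of the idea using only the standard library. The class name `TinyContentStore` and its methods are hypothetical and are not part of the MCard API; the sketch only assumes SHA-256 addressing, which the highlights below mention.

```python
import hashlib


class TinyContentStore:
    """Toy in-memory content-addressable store (illustrative, not MCard's API)."""

    def __init__(self) -> None:
        self._blobs: dict[str, bytes] = {}

    def add(self, content: bytes) -> str:
        # The address is derived from the content itself, so identical
        # content always maps to the same key.
        digest = hashlib.sha256(content).hexdigest()
        self._blobs[digest] = content
        return digest

    def get(self, digest: str) -> bytes:
        content = self._blobs[digest]
        # Re-hash on read: a mismatch means the stored bytes were altered.
        if hashlib.sha256(content).hexdigest() != digest:
            raise ValueError("stored content does not match its hash")
        return content


store = TinyContentStore()
address = store.add(b"Hello MCard")
assert store.get(address) == b"Hello MCard"
assert store.add(b"Hello MCard") == address  # same content, same address
```

Because the key is the hash of the value, deduplication and tamper detection both fall out of the addressing scheme rather than needing separate bookkeeping.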


Highlights

  • 🔐 Hash-verifiable storage: Unified network of relationships via SHA-256 hashing across content, handles, and history.
  • ♾️ Universal Substrate: Emulates the Turing Machine "Infinitely Long Tape" via relational queries for computable DSLs.
  • ♻️ Deterministic execution: PTR mediates 8 polyglot runtimes (Python, JavaScript, Rust, C, WASM, Lean, R, Julia).
  • 📊 Enterprise ready: Structured logging, CI/CD pipeline, security auditing, 99%+ automated test coverage.
  • 🧠 AI-native extensions: GraphRAG engine, optional LLM runtime, and optimized multimodal vision (moondream).
  • ⚛️ Quantum NLP: Optional lambeq + PyTorch integration for pregroup grammar and quantum circuit compilation.
  • 🧰 Developer friendly: Rich Python API, TypeScript SDK, BMAD-driven TDD workflow, numerous examples.
  • 📐 Algorithm Benchmarks: Sine comparison (Taylor vs Chebyshev) across Python, C, and Rust.
  • High Performance: Optimized test suite (~37s) with runtime caching and session-scoped fixtures.
  • 🦆 DuckDB Engine: Optional columnar OLAP storage backend — same StorageEngine interface, ideal for analytical workloads and Parquet I/O.
  • 📋 Single Source of Truth Schema: Both SQLite and DuckDB engines load schema exclusively from canonical SQL files (mcard_schema.sql, mcard_vector_schema.sql) — zero hardcoded CREATE TABLE statements.
  • 🔄 Shared MIME Registry: A single mime_extensions.json drives content-type detection across both Python and TypeScript — edit one file to update both runtimes, no recompilation needed.
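The shared-registry idea in the last highlight can be sketched as a plain JSON lookup. The registry shape used here (a flat MIME type to extension mapping) and the helper names are assumptions for illustration; the actual layout of mime_extensions.json may differ.

```python
import json

# Hypothetical registry contents; the real mime_extensions.json may be shaped differently.
REGISTRY_JSON = '{"text/plain": ".txt", "application/json": ".json", "image/png": ".png"}'


def load_registry(raw: str) -> dict[str, str]:
    """Parse registry text into a MIME-type -> extension mapping."""
    return json.loads(raw)


def extension_for(mime_type: str, registry: dict[str, str], default: str = ".bin") -> str:
    """Return the extension for a MIME type, falling back to a default."""
    return registry.get(mime_type, default)


registry = load_registry(REGISTRY_JSON)
print(extension_for("image/png", registry))        # .png
print(extension_for("video/x-unknown", registry))  # .bin
```

Keeping the mapping in data rather than code is what lets a second runtime (here, TypeScript) read the same file without either side being rebuilt.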

For the long-form narrative and chapter roadmap, see docs/theory/Narrative_Roadmap.md. Architectural philosophy is captured in docs/architecture/Monadic_Duality.md.


Quick Start (Python)

git clone https://github.com/xlp0/MCard_TDD.git
cd MCard_TDD
make setup-dev              # creates .venv with uv, installs all deps + pre-commit
uv run pytest -q -m "not slow"  # run the fast Python test suite
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml

Development Setup

This project uses uv as the sole Python dependency manager. All dependencies are defined in pyproject.toml and locked in uv.lock.

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv and install all dependencies (including dev)
make setup-dev
# Or manually:
uv venv --prompt MCard_TDD
uv sync --all-extras --dev

# Run commands via uv
uv run pytest           # run tests
uv run ruff check mcard/  # lint
uv run python script.py   # run any script

Create and retrieve a card:

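from mcard import MCard, default_collection

# Wrap the content in a card; adding it to the collection returns its hash.
card = MCard("Hello MCard")
hash_value = default_collection.add(card)

# Look the card up again by hash and print its content as text.
retrieved = default_collection.get(hash_value)
print(retrieved.get_content(as_text=True))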

Quick Start (JavaScript / WASM)

See mcard-js/README.md for build, testing, and npm publishing instructions for the TypeScript implementation.

Quick Start (Quantum NLP)

MCard optionally integrates with lambeq for quantum natural language processing using pregroup grammar:

# Install with Quantum NLP support (requires Python 3.10+)
uv pip install -e ".[qnlp]"

# Parse a sentence into a pregroup grammar diagram
uv run python scripts/lambeq_web.py "John gave Mary a flower"

Example output (pregroup types):

John: n    gave: n.r @ s @ n.l @ n.l    Mary: n    a flower: n
Result: s (grammatically valid sentence)

The pregroup diagrams can be compiled to quantum circuits for QNLP experiments.


Polyglot Runtime Matrix

Runtime    | Status | Notes
Python     |        | Reference implementation, CLM runner
JavaScript |        | Node + browser (WASM) + Full RAG Support + Pyodide
Rust       |        | High-performance adapter & WASM target
C          |        | Low-level runtime integration
WASM       |        | Edge and sandbox execution
Lean       | ⚙️     | Formal verification pipeline (requires lean-toolchain)
R          |        | Statistical computing runtime
Julia      |        | High-performance scientific computing

⚠️ Lean Configuration: A lean-toolchain file in the project root is critical. Without it, elan will attempt to resolve/download toolchain metadata on every invocation, causing CLM execution to hang or become unbearably slow.

Compiling Native Binaries

For detailed instructions on compiling the required C, Rust, and WASM binaries for the polyglot tests, please see the Compiling Native Binaries Guide.


Project Structure (abridged)

MCard_TDD/
├── mcard/            # Python package (engines, models, PTR)
├── mcard-js/         # TypeScript SDK — 3 interchangeable storage engines, PTR, RAG
├── mcard-studio/     # [Submodule] Astro + React PWA — artifact IDE with VCard events
├── LandingPage/      # [Submodule] Static-first P2P Documentation & Landing Portal
├── chapters/         # CLM specifications (polyglot demos)
├── docs/             # Architecture, PRD, guides, reports
├── scripts/          # Automation & demo scripts
├── tests/            # >815 automated tests (Python)
├── mime_extensions.json  # Shared MIME-type registry (Python + TypeScript)
└── pyproject.toml    # uv-managed dependencies (uv.lock)

Submodule Organization

This repository orchestrates two key frontend applications as submodules, each serving a distinct role in the MCard ecosystem:

1. mcard-studio (/mcard-studio)

The Interactive IDE for Eventual Consistency & Eventual Correctness. A Progressive Web App (PWA) built with Astro and React. It serves as the primary interface for creating, editing, and executing MCards and CLMs.

  • Tech Stack: Astro, React, Zustand, Monaco Editor, Anime.js/GSAP, Mermaid.
  • Key Features: Four-store persistence (servermemory.db, browsermemory.db, execution_logs.db, filesystem), dual-mode CLM execution (browser-first JS → server fallback), VCard event pipeline with result sealing, inline rename/upload/create, version history with time-travel, native AI assistant (Ollama), 30+ file type renderers.
  • Role: The "Editor" & "Runtime" environment for developers and power users.
  • Test Results: 372 tests passed (37 test files).

2. LandingPage (/LandingPage)

The Public Portal & Knowledge Container. A static-first modular web application designed for decentralized distribution. It focuses on P2P communication, documentation rendering, and interactive 3D visualizations.

  • Tech Stack: Vanilla JS Modules, WebRTC (No signaling server), Three.js, KaTeX, Mermaid.
  • Key Features: Serverless P2P mesh networking, zero-dependency architecture (runs locally without build steps), and rich markdown/media rendering.
  • Role: The "Viewer" & "distributable container" for the Personal Knowledge Container (PKC) concept.

Documentation

The theoretical foundations, including the Function Economy, Petri Net Scheduler, Dual-Handle Memory Architecture, and Concurrency Protection, have been consolidated into the Platform Vision Document. A companion document (… Petri Net Implementation.md) covers the physical implementation mapping.


Recent Updates

Full changelog: CHANGELOG.md

Current versions: Python mcard 0.1.60 · TypeScript mcard-js 2.1.43

Polyglot Runtime & WASM Integration Fixes (v0.1.60 / v2.1.43): Fixed critical missing --experimental-wasm-modules environment flags in Python subprocess integration, rectified module:// protocol resolution for cross-environment testing, and enforced importlib dynamic module bootstrapping in NodeJS. Patched Rust/TypeScript content-type string detection parity bugs and Python 3.9 __future__ typing incompatibilities.

Python Build & Syntax Corrections (v0.1.59 / v2.1.42): Fixed mcard syntax, namespace, and import dependency compilation errors across improved_logging.py, card_collection.py, and logging_config.py. Restored 100% test build health.

Storage Layer Deduplication (v0.1.58 / v2.1.42): Comprehensive refactoring to centralize card operations into AbstractSqlEngine (TypeScript) and simplify connection paths via resolve_db_path() (Python). Eliminated over 250 lines of duplicate code across the 4 TypeScript SQL engines, standardized dialect handling (SQLite TEXT vs DuckDB VARCHAR), and fixed a latent foreign key bug in handle renaming. All functionality remains fully backward compatible.

🏗️ Major Project Restructuring (v0.1.56 / v2.1.38): Four-phase structural overhaul — root files relocated to proper directories, documentation reorganized (32 flat files → 7 subdirectories), scripts reorganized (19 files → 6 subdirectories), Python tests restructured (28 files → 5 subdirectories), TypeScript engine implementations moved to storage/engines/ with barrel re-exports, factory pattern migration (SqliteNodeEngine.create()), test database cleanup, and .gitignore hardening. Runtime behavior is unchanged; all 849 TS tests and 767 Python tests pass.

Recent milestones also include DuckDB as an alternative storage engine, shared MIME registry (mime_extensions.json), PTR exception narrowing (109 broad catches → specific types), SqlJs vector adapter for browser-based vector search, and ContentTypeInterpreter event-loop starvation fix. See CHANGELOG.md for full details.

Testing

Note: All commands below should be run from the project root (MCard_TDD/).

Unit Tests

# Python
uv run pytest -q                 # Run all tests
uv run pytest -q -m "not slow"   # Fast tests only
uv run pytest -m "not network"   # Skip LLM/Ollama tests

# JavaScript
npm --prefix mcard-js test -- --run

# Browser (MCard Studio)
npm --prefix mcard-studio run test:unit -- --run

CLM Verification

Both Python and JavaScript CLM runners support three modes: all, directory, and single file.

Python

# Run all CLMs
uv run python scripts/clm/run_clms.py

# Run by directory
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P

# Run single file
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml

# Run with custom context
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P/generic_session.yaml \
    --context '{"sessionId": "my-session"}'

JavaScript

# Run all CLMs
npm --prefix mcard-js run clm:all

# Run by directory/filter
npm --prefix mcard-js run clm:all -- chapter_01_arithmetic
npm --prefix mcard-js run clm:all -- chapters/chapter_08_P2P

# Run single file
npm --prefix mcard-js run demo:clm -- chapters/chapter_01_arithmetic/addition_js.yaml

Chapter Directories

Directory             | Description
chapter_00_prologue   | Hello World, Lambda calculus, and Church encoding (11 CLMs)
chapter_01_arithmetic | Arithmetic operations (Python, JS, Lean) (27 CLMs)
chapter_02_handle     | Handle operations and dual retrieval
chapter_03_llm        | LLM integration (requires Ollama)
chapter_04_load_dir   | Filesystem and collection loading
chapter_05_reflection | Meta-programming and recursive CLMs
chapter_06_lambda     | Lambda calculus runtime
chapter_07_network    | HTTP requests, MCard sync, network I/O (5 CLMs)
chapter_08_P2P        | P2P networking and WebRTC (16 CLMs, 3 VCard)
chapter_09_DSL        | Meta-circular language definition and combinators (10 CLMs)
chapter_10_service    | Static server builtin and service management (3 CLMs)

Contributing

  1. Fork the repository and create a feature branch.
  2. Run the tests (uv run pytest, npm test in mcard-js).
  3. Submit a pull request describing your change and tests.

Future Roadmap

Road to VCard (Design & Implementation)

Based on the MVP Cards Design Rationale, a VCard (Value Card) represents a boundary-enforced value exchange unit that often contains sensitive private data (identities, private keys, financial claims). Unlike standard MCards, which are designed for public distribution and reproducibility, VCards require strict confidentiality.

Design Requirements & Rationale:

  1. Privacy & Encryption: VCards cannot be stored in the standard mcard.db (which is often shared or public) without encryption. They must be stored in a "physically separate" container or be encrypted at rest.
  2. Authentication Primitive: A VCard serves as a specialized "Certificate of Authority" — a precondition for executing sensitive PTR actions.
  3. Audit Certificates: Execution of a VCard-authorized action must produce a VerificationVCard (Certificate of Execution), which proves the action occurred under authorization. This certificate is also sensitive.
  4. Unified Schema: While the storage location differs, the data schema should remain identical to MCard (content addressable, hash-linked) to reuse the rigorous polynomial logic.

Proposed Architecture:

  • Dual-Database Storage:
    • mcard.db (Public/Shared): Stores standard MCards, Logic (PCards), and Public Keys.
    • vcard.db (Private/Local): Stores VCards, Encrypted Private Keys, and Verification Certificates.
  • Execution Flow: execute(pcard_hash, input, vcard_authorization_hash)
    1. Gatekeeper: PTR checks if vcard_authorization_hash exists in the Private Store (vcard.db).
    2. Zero-Trust Verify: Runtime validates the VCard's cryptographic integrity and permissions (Security Polynomial).
    3. Execute: If valid, the PCard logic runs.
    4. Certify: A new VerificationVCard is generated, signed, and stored in vcard.db, linking the Input, Output, and Authority.
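The four-step execution flow above can be sketched as follows. This is an illustration under stated assumptions, not the PTR implementation: the private store is modeled as a plain dict, the PCard as a Python callable, the permission check (Security Polynomial) is left out, and the certificate is stored unsigned. All names here are hypothetical.

```python
import hashlib
import json
from typing import Callable


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def execute(pcard: Callable[[int], int], pcard_hash: str, value: int,
            vcard_hash: str, private_store: dict[str, bytes]) -> dict:
    # 1. Gatekeeper: the authorization must exist in the private store.
    vcard = private_store.get(vcard_hash)
    if vcard is None:
        raise PermissionError("unknown VCard")
    # 2. Verify: the stored VCard must still hash to its own address.
    #    (Permission checks are omitted in this sketch.)
    if sha256_hex(vcard) != vcard_hash:
        raise PermissionError("VCard failed integrity check")
    # 3. Execute the PCard logic.
    output = pcard(value)
    # 4. Certify: link input, output, and authority, then store the record.
    #    (Signing is omitted here; the design calls for a signed certificate.)
    certificate = {"pcard": pcard_hash, "input": value,
                   "output": output, "authority": vcard_hash}
    cert_bytes = json.dumps(certificate, sort_keys=True).encode()
    private_store[sha256_hex(cert_bytes)] = cert_bytes
    return certificate


private_store: dict[str, bytes] = {}
vcard = b'{"subject": "demo"}'
vcard_hash = sha256_hex(vcard)
private_store[vcard_hash] = vcard

cert = execute(lambda x: x + 1, "demo-pcard-hash", 41, vcard_hash, private_store)
print(cert["output"])  # 42
```

Storing the certificate under its own hash keeps it in the same content-addressed scheme as every other card, which matches the "Unified Schema" requirement.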

TODOs:

  • Infrastructure: Implement PrivateCollection (wrapper around vcard.db) in Python and JavaScript factories.
  • Encryption Middleware: Add a transparent encryption layer (e.g., AES-GCM) for the Private Collection to ensure Encryption-at-Rest.
  • CLI Auth: Update run_clms.py to accept --auth <vcard_hash> and mount the private keystore.
  • Certificate Generation: Implement the VerificationVCard schema and generation logic in CLMRunner.

Logical Model Certification & Functional Deployment

Use of the Cubical Logic Model (CLM) as a "Qualified Logical Model" is strictly governed by principles derived from Eelco Dolstra's The Purely Functional Software Deployment Model (the theoretical basis of Nix).

A CLM is not merely source code; it is a candidate for certification. It only becomes a Qualified Logical Model when it possesses a valid Certification, which is a cryptographic proof of successful execution by a specific version of the Polynomial Type Runtime (PTR).

The Functional Certification Equation:

$$ Observation = PTR_{vX.Y.Z}(CLM_{Source}) $$

$$ Certification = Sign_{Authority}(Hash(CLM_{Source}) + Hash(PTR_{vX.Y.Z}) + Hash(Observation)) $$
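A minimal sketch of the two equations above, using only the standard library. Two substitutions are assumptions made for illustration: HMAC-SHA256 with a shared key stands in for Sign_Authority (a real deployment would more likely use an asymmetric signature), and the runtime is identified by hashing its version string rather than the runtime itself. The function names are hypothetical.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def certify(clm_source: bytes, ptr_version: str, observation: bytes,
            authority_key: bytes) -> str:
    # Concatenate the three hashes, mirroring the "+" in the equation.
    payload = h(clm_source) + h(ptr_version.encode()) + h(observation)
    # HMAC-SHA256 is a stand-in for Sign_Authority in this sketch.
    return hmac.new(authority_key, payload, hashlib.sha256).hexdigest()


key = b"demo-authority-key"
cert_a = certify(b"clm: add", "v1.0.0", b"result: 3", key)
cert_b = certify(b"clm: add", "v1.0.1", b"result: 3", key)

assert cert_a == certify(b"clm: add", "v1.0.0", b"result: 3", key)  # reproducible
assert cert_a != cert_b  # a different runtime version yields a different certificate
```

Because the runtime version is part of the signed payload, a certificate issued for one PTR version cannot be reused for another, so the CLM has to be re-executed to be re-qualified.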

Parallels to the Nix Model:

  1. Hermetic Inputs: Just as a Nix derivation hashes all inputs (compiler, libs, source), a CLM Certification depends on the exact PTR Runtime Version and CLM Content Hash. Changing the runtime version invalidates the certificate, requiring re-qualification (re-execution).
  2. Deterministic Derivation: The "build" step is the execution of the CLM's verification logic. If the PTR (the builder) is deterministic, the output (VerificationVCard) is reproducible.
  3. The "Store": The mcard.db acts as the Nix Store, holding immutable, content-addressed CLMs. The vcard.db acts as the binary cache, holding signed Certifications (outputs) that prove a CLM works for a given runtime configuration.

This ensures that a "Qualified CLM" is not just "code that looks right," but "code that has logically proven itself" within a specific, physically identifiable execution environment.


License

This project is licensed under the MIT License – see LICENSE.

For release notes, check CHANGELOG.md.
