
MCard: Local-first Content Addressable Storage with Content Type Detection


Python 3.10+ MIT License ruff Build Status

MCard

MCard is a local-first, content-addressable storage platform with cryptographic integrity, temporal ordering, and a Polynomial Type Runtime (PTR) that orchestrates polyglot execution. It gives teams a verifiable data backbone without sacrificing developer ergonomics or observability.


Highlights

  • 🔐 Hash-verifiable storage: Unified network of relationships via SHA-256 hashing across content, handles, and history.
  • 🆔 Kenotic Identity System: Independent Soul Bound Token (SBT) issuance and VCard Gatekeeper authorization using strict decoupled IDENTITY_SPACE_PATH domains.
  • ♾️ Universal Substrate: Emulates the Turing Machine "Infinitely Long Tape" via relational queries for computable DSLs.
  • ♻️ Deterministic execution: PTR mediates 8 polyglot runtimes (Python, JavaScript, Rust, C, WASM, Lean, R, Julia).
  • 📊 Enterprise ready: Structured logging, CI/CD pipeline, security auditing, 99%+ automated test coverage.
  • 🧠 AI-native extensions: GraphRAG engine, optional LLM runtime, and optimized multimodal vision (moondream).
  • ⚛️ Quantum NLP: Optional lambeq + PyTorch integration for pregroup grammar and quantum circuit compilation.
  • 🧰 Developer friendly: Rich Python API, TypeScript SDK, BMAD-driven TDD workflow, numerous examples.
  • 📐 Algorithm Benchmarks: Sine comparison (Taylor vs Chebyshev) across Python, C, and Rust.
  • ⚡ High Performance: Optimized test suite (~37s) with runtime caching and session-scoped fixtures.
  • 🦆 DuckDB Engine: Optional columnar OLAP storage backend with the same StorageEngine interface, ideal for analytical workloads and Parquet I/O.
  • 📋 Single Source of Truth Schema: Both SQLite and DuckDB engines load schema exclusively from canonical SQL files (mcard_schema.sql, mcard_vector_schema.sql), with zero hardcoded CREATE TABLE statements.
  • 🔄 Shared MIME Registry: A single mime_extensions.json drives content-type detection across both Python and TypeScript; edit one file to update both runtimes, no recompilation needed.
  • ⏱️ Centralized Operational Constants: All timing literals (PTR timeouts, LLM defaults, RAG embedding delays, signaling server intervals) unified in config_constants.py with full TypeScript parity; zero hard-coded timeouts remain in runtime code.

For the long-form narrative and chapter roadmap, see docs/theory/Narrative_Roadmap.md. Architectural philosophy is captured in docs/architecture/Monadic_Duality.md.


Quick Start (Python)

git clone https://github.com/xlp0/MCard_TDD.git
cd MCard_TDD
make setup-dev              # creates .venv with uv, installs all deps + pre-commit
make project-check          # validates Python, mcard-js, and mcard-studio
uv run pytest -q -m "not slow"  # run the fast Python test suite
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml

Development Setup

This project uses uv as the sole Python dependency manager. All dependencies are defined in pyproject.toml and locked in uv.lock.

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create venv and install all dependencies (including dev)
make setup-dev
# Or manually:
uv venv --prompt MCard_TDD
uv sync --all-extras --dev

# Run commands via uv
uv run pytest           # run tests
uv run ruff check mcard/  # lint
uv run python script.py   # run any script

Configuration

  • Copy example.env to .env for local development.
  • example.env is the canonical template; .env.example remains as a legacy-compatible copy.
  • The studio app uses mcard-studio/example.env for its own local .env seed.
  • Environment precedence is: CLI overrides → process environment → .env → example.env defaults.
  • Shared config defaults are defined in config/app_config.json and consumed by Python, TypeScript, and studio runtime helpers.
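The precedence order above can be modeled as a layered lookup in which the first layer that defines a key wins. The following is a minimal sketch of that idea only: `build_config` and the sample keys are hypothetical and are not taken from MCard's configuration helpers.

```python
from collections import ChainMap


def build_config(cli_overrides, process_env, dotenv_values, example_defaults):
    """Return a lookup in which earlier layers shadow later ones (hypothetical helper)."""
    # ChainMap searches its maps left to right, so the first match wins.
    return ChainMap(cli_overrides, process_env, dotenv_values, example_defaults)


# Sample values are invented for illustration.
config = build_config(
    cli_overrides={"LOG_LEVEL": "DEBUG"},
    process_env={"LOG_LEVEL": "INFO", "DB_PATH": "/tmp/env.db"},
    dotenv_values={"DB_PATH": "./local.db", "PORT": "3000"},
    example_defaults={"PORT": "8080", "TIMEOUT": "30"},
)

print(config["LOG_LEVEL"])  # DEBUG (CLI override beats the process environment)
print(config["DB_PATH"])    # /tmp/env.db (process environment beats .env)
print(config["PORT"])       # 3000 (.env beats example.env defaults)
print(config["TIMEOUT"])    # 30 (only the defaults define it)
```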

Create and retrieve a card:

from mcard import MCard, default_collection

card = MCard("Hello MCard")
hash_value = default_collection.add(card)
retrieved = default_collection.get(hash_value)
print(retrieved.get_content(as_text=True))
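For intuition about what content addressing means, here is a standalone toy store in which the key is derived from the content itself. It is not MCard's implementation: `TinyStore` is a hypothetical class, and using the SHA-256 hex digest of the UTF-8 bytes as the address is an assumption based on the Highlights above.

```python
import hashlib


class TinyStore:
    """Toy content-addressable store (hypothetical, for illustration only)."""

    def __init__(self):
        self._blobs = {}

    def add(self, content: str) -> str:
        data = content.encode("utf-8")
        digest = hashlib.sha256(data).hexdigest()
        # Identical content always maps to the same key, so adds are idempotent.
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> str:
        data = self._blobs[digest]
        # Re-hash on read so silent corruption is detected.
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored content does not match its address")
        return data.decode("utf-8")


store = TinyStore()
first = store.add("Hello MCard")
second = store.add("Hello MCard")
print(first == second)   # True: same content, same address
print(store.get(first))  # Hello MCard
```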

Quick Start (JavaScript / WASM)

See mcard-js/README.md for build, testing, and npm publishing instructions for the TypeScript implementation.

Quick Start (Quantum NLP)

MCard optionally integrates with lambeq for quantum natural language processing using pregroup grammar:

# Install with Quantum NLP support (requires Python 3.10+)
uv pip install -e ".[qnlp]"

# Parse a sentence into a pregroup grammar diagram
uv run python scripts/lambeq_web.py "John gave Mary a flower"

Example output (pregroup types):

John: n    gave: n.r @ s @ n.l @ n.l    Mary: n    a flower: n
Result: s (grammatically valid sentence)

The pregroup diagrams can be compiled to quantum circuits for QNLP experiments.
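The reduction to `s` in the example output can be checked by hand: `n` followed by `n.r` cancels, and `n.l` followed by `n` cancels. The sketch below mechanizes that check in plain Python without lambeq. `parse_type` and `reduce_types` are hypothetical helpers, and the greedy stack strategy is sufficient for this sentence but is not a complete pregroup parser.

```python
def parse_type(text):
    """Turn 'n.r @ s @ n.l' into [('n', 1), ('s', 0), ('n', -1)]."""
    adjoint = {"l": -1, "r": 1}  # left adjoint = -1, right adjoint = +1
    simple_types = []
    for part in text.split("@"):
        base, _, suffix = part.strip().partition(".")
        simple_types.append((base, adjoint.get(suffix, 0)))
    return simple_types


def reduce_types(simple_types):
    """Cancel adjacent pairs (x, z)(x, z + 1), e.g. n.l n and n n.r."""
    stack = []
    for base, z in simple_types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()  # contraction: the pair reduces to the unit
        else:
            stack.append((base, z))
    return stack


# Types copied from the example output above.
words = ["n", "n.r @ s @ n.l @ n.l", "n", "n"]
sequence = [t for word in words for t in parse_type(word)]
print(reduce_types(sequence))  # [('s', 0)]
```

A result containing only `('s', 0)` corresponds to the "grammatically valid sentence" verdict.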


Polyglot Runtime Matrix

| Runtime | Status | Notes |
|---|---|---|
| Python | ✅ | Reference implementation, CLM runner |
| JavaScript | ✅ | Node + browser (WASM) + Full RAG Support + Pyodide |
| Rust | ✅ | High-performance adapter & WASM target |
| C | ✅ | Low-level runtime integration |
| WASM | ✅ | Edge and sandbox execution |
| Lean | ⚙️ | Formal verification pipeline (requires lean-toolchain) |
| R | ✅ | Statistical computing runtime |
| Julia | ✅ | High-performance scientific computing |

⚠️ Lean Configuration: A lean-toolchain file in the project root is critical. Without it, elan will attempt to resolve/download toolchain metadata on every invocation, causing CLM execution to hang or become unbearably slow.

Compiling Native Binaries

For detailed instructions on compiling the required C, Rust, and WASM binaries for the polyglot tests, please see the Compiling Native Binaries Guide.


Project Structure (abridged)

MCard_TDD/
├── mcard/            # Python package (engines, models, PTR)
├── mcard-js/         # TypeScript SDK: 3 interchangeable storage engines, PTR, RAG
├── mcard-studio/     # [Submodule] Astro + React PWA: artifact IDE with VCard events
├── LandingPage/      # [Submodule] Static-first P2P Documentation & Landing Portal
├── chapters/         # CLM specifications (polyglot demos)
├── docs/             # Architecture, PRD, guides, reports
├── scripts/          # Automation & demo scripts
├── tests/            # >815 automated tests (Python)
├── mime_extensions.json  # Shared MIME-type registry (Python + TypeScript)
└── pyproject.toml    # uv-managed dependencies (uv.lock)

Submodule Organization

This repository orchestrates two key frontend applications as submodules, each serving a distinct role in the MCard ecosystem:

1. mcard-studio (/mcard-studio)

The Interactive IDE for Eventual Consistency & Eventual Correctness. A Progressive Web App (PWA) built with Astro and React. It serves as the primary interface for creating, editing, and executing MCards and CLMs.

  • Tech Stack: Astro, React, Zustand, Monaco Editor, Anime.js/GSAP, Mermaid.
  • Key Features: Four-store persistence (servermemory.db, browsermemory.db, execution_logs.db, filesystem), dual-mode CLM execution (browser-first JS → server fallback), VCard event pipeline with result sealing, inline rename/upload/create, version history with time-travel, native AI assistant (Ollama), 30+ file type renderers.
  • Role: The "Editor" & "Runtime" environment for developers and power users.
  • Test Results: 372 tests passed (37 test files).

2. LandingPage (/LandingPage)

The Public Portal & Knowledge Container. A static-first modular web application designed for decentralized distribution. It focuses on P2P communication, documentation rendering, and interactive 3D visualizations.

  • Tech Stack: Vanilla JS Modules, WebRTC (No signaling server), Three.js, KaTeX, Mermaid, ZITADEL (OIDC Authentication).
  • Key Features: Serverless P2P mesh networking, zero-dependency architecture (runs locally without build steps), rich markdown/media rendering, and interactive authenticated pedagogical games (Monopoly, Chess, Go) with ZITADEL SSO integration.
  • Role: The "Viewer" & "distributable container" for the Personal Knowledge Container (PKC) concept.

Documentation

Platform Vision & Architecture

The theoretical foundations, including the Function Economy, Petri Net Scheduler, Dual-Handle Memory Architecture, and Concurrency Protection, have been consolidated into the Platform Vision Document.


Recent Updates

Full changelog: CHANGELOG.md

Current versions: Python mcard 0.1.64 · TypeScript mcard-js 2.1.47

PTR Security Hardening (Bandit Remediation, March 31, 2026): Systematic remediation of all Medium and High severity findings from Bandit static analysis across 19 files. Introduced mcard/utils/url_safety.py, which provides a centralized safe_urlopen() helper that validates URL schemes (http/https only) before opening and replaces raw urllib.request.urlopen calls in 7 modules. Migrated XML parsing to defusedxml.ElementTree (with stdlib fallback) to prevent XXE/billion-laughs attacks. Eliminated shell=True subprocess calls in the signaling server by converting them to argument-list invocations. Marked the MD5 call used for network cache key generation with usedforsecurity=False, since it is not used for security. All 770 Python tests pass with 0 High / 0 Medium Bandit findings.
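The scheme check described above can be sketched with the standard library. This is an illustration of the pattern, not the contents of mcard/utils/url_safety.py: `validate_url`, `safe_urlopen_sketch`, and the default timeout are hypothetical.

```python
from urllib.parse import urlsplit
from urllib.request import urlopen

ALLOWED_SCHEMES = {"http", "https"}


def validate_url(url: str) -> str:
    """Return the URL unchanged if its scheme is http or https, else raise."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        # Rejects file://, ftp://, data: and similar schemes before any I/O.
        raise ValueError(f"refusing to open URL with scheme {scheme!r}")
    return url


def safe_urlopen_sketch(url: str, timeout: float = 10.0):
    """Open the URL only after the scheme check passes (hypothetical wrapper)."""
    return urlopen(validate_url(url), timeout=timeout)


print(validate_url("https://example.org/data"))
try:
    validate_url("file:///etc/passwd")
except ValueError as exc:
    print(exc)
```

Validating before calling `urlopen` matters because `urllib` will otherwise follow local schemes such as `file://`.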

Config Unification, Canonical Env Templates, and Smoke Cleanup (March 28, 2026): Centralized shared configuration defaults in config/app_config.json with parity across Python, TypeScript, and studio runtime helpers; standardized the root example.env / legacy .env.example contract plus mcard-studio/example.env; documented environment precedence and CI seeding; and quieted boot-time ingest warnings during smoke runs while preserving the startup summary and the standalone websocket server path.

Binary File Size Limit Increase (v0.1.64 / v2.1.47): Raised the maximum binary file size from 50 MB to 150 MB across all runtimes (Python, TypeScript, Studio), enabling ingestion and preview of larger PDF, video, and other binary assets. Updated MAX_FILE_SIZE, MAX_ARTIFACT_SAVE_BYTES, MAX_BINARY_PREVIEW_BYTES, and problematic-file thresholds in 7 files.

Polyglot Runtime Stabilization & CLM Pipeline (v0.1.63 / v2.1.46): Stabilized the cross-environment execution pipeline for Cubical Logic Models (CLMs) across Python, Node.js, and Rust/WASM. Improved Python interpreter resolution in the JS SDK, added first-class Node.js runtime support, and implemented automatic server-side delegation for Node-dependent browser CLMs. Fixed gatekeeper evaluations and test assertions across all 13 prologue CLMs, achieving 100% test pass rates in both the CLI and Studio environments.

Pedagogical Game Integration via ZITADEL SSO: Added authenticated views for Chess, Go, and Monopoly to LandingPage, using ZITADEL OIDC to protect game state boundaries and user identity mapping as part of the broader ethnographic scaling research implementations.

Identity System Authentication Update (v0.1.62 / v2.1.45): Implemented a complete credential-based identity authentication system in mcard-studio, replacing the previous passwordless registration flow with server-side PBKDF2 password hashing and timing-safe login verification.
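The combination of PBKDF2 hashing and timing-safe verification follows a common pattern, sketched here in Python for consistency with the rest of this README. It is not the mcard-studio code: the function names, salt size, and iteration count are illustrative assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor; pick one per current guidance


def hash_password(password, salt=None):
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected):
    """Recompute the digest and compare it in constant time."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, expected)


salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```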

Polyglot Runtime & WASM Integration Fixes (v0.1.60 / v2.1.43): Fixed critical missing --experimental-wasm-modules environment flags in Python subprocess integration, rectified module:// protocol resolution for cross-environment testing, and enforced importlib dynamic module bootstrapping in NodeJS. Patched Rust/TypeScript content-type string detection parity bugs and Python 3.9 __future__ typing incompatibilities.

Python Build & Syntax Corrections (v0.1.59 / v2.1.42): Fixed mcard syntax, namespace, and import dependency compilation errors across improved_logging.py, card_collection.py, and logging_config.py. Restored 100% test build health.

Storage Layer Deduplication (v0.1.58 / v2.1.42): Comprehensive refactoring to centralize card operations into AbstractSqlEngine (TypeScript) and simplify connection paths via resolve_db_path() (Python). Eliminated over 250 lines of duplicate code across the 4 TypeScript SQL engines, standardized dialect handling (SQLite TEXT vs DuckDB VARCHAR), and fixed a latent foreign key bug in handle renaming. All functionality remains fully backward compatible.

🏗️ Major Project Restructuring (v0.1.56 / v2.1.38): Four-phase structural overhaul: root files relocated to proper directories, documentation reorganized (32 flat files → 7 subdirectories), scripts reorganized (19 files → 6 subdirectories), Python tests restructured (28 files → 5 subdirectories), TypeScript engine implementations moved to storage/engines/ with barrel re-exports, factory pattern migration (SqliteNodeEngine.create()), test database cleanup, and .gitignore hardening. Runtime behavior is unchanged; all 849 TS tests and 767 Python tests pass.

Recent milestones also include DuckDB as an alternative storage engine, shared MIME registry (mime_extensions.json), PTR exception narrowing (109 broad catches โ†’ specific types), SqlJs vector adapter for browser-based vector search, and ContentTypeInterpreter event-loop starvation fix. See CHANGELOG.md for full details.

Testing

Note: All commands below should be run from the project root (MCard_TDD/).

Unit Tests

# Python
uv run pytest -q                 # Run all tests
uv run pytest -q -m "not slow"   # Fast tests only
uv run pytest -m "not network"   # Skip LLM/Ollama tests

# JavaScript
npm --prefix mcard-js test -- --run

# Browser (MCard Studio)
npm --prefix mcard-studio run test:unit -- --run

CLM Verification

Both Python and JavaScript CLM runners support three modes: all, directory, and single file.

Python

# Run all CLMs
uv run python scripts/clm/run_clms.py

# Run by directory
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P

# Run single file
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml

# Run with custom context
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P/generic_session.yaml \
    --context '{"sessionId": "my-session"}'

JavaScript

# Run all CLMs
npm --prefix mcard-js run clm:all

# Run by directory/filter
npm --prefix mcard-js run clm:all -- chapter_01_arithmetic
npm --prefix mcard-js run clm:all -- chapters/chapter_08_P2P

# Run single file
npm --prefix mcard-js run demo:clm -- chapters/chapter_01_arithmetic/addition_js.yaml

Chapter Directories

| Directory | Description |
|---|---|
| chapter_00_prologue | Hello World, Lambda calculus, identity basics, and Church encoding (13 CLMs) |
| chapter_01_arithmetic | Arithmetic operations (Python, JS, Lean) (27 CLMs) |
| chapter_02_handle | Handle operations and dual retrieval |
| chapter_03_llm | LLM integration (requires Ollama) |
| chapter_04_load_dir | Filesystem and collection loading |
| chapter_05_reflection | Meta-programming and recursive CLMs |
| chapter_06_lambda | Lambda calculus runtime |
| chapter_07_network | HTTP requests, MCard sync, network I/O (5 CLMs) |
| chapter_08_P2P | P2P networking and WebRTC (16 CLMs, 3 VCard) |
| chapter_09_DSL | Meta-circular language definition and combinators (10 CLMs) |
| chapter_10_service | Static server builtin and service management (3 CLMs) |

Contributing

  1. Fork the repository and create a feature branch.
  2. Read CONTRIBUTING.md for the normalized workflow and project-wide validation commands.
  3. Submit a pull request describing your change and tests.

Future Roadmap

Road to VCard (Design & Implementation)

Based on the MVP Cards Design Rationale, a VCard (Value Card) represents a boundary-enforced value exchange unit that often contains privacy-sensitive data (identities, private keys, financial claims). Unlike standard MCards, which are designed for public distribution and reproducibility, VCards require strict confidentiality.

Design Requirements & Rationale:

  1. Privacy & Encryption: VCards cannot be stored in the standard mcard.db (which is often shared or public) without encryption. They must be stored in a "physically separate" container or be encrypted at rest.
  2. Authentication Primitive: A VCard serves as a specialized "Certificate of Authority": a precondition for executing sensitive PTR actions.
  3. Audit Certificates: Execution of a VCard-authorized action must produce a VerificationVCard (Certificate of Execution), which proves the action occurred under authorization. This certificate is also sensitive.
  4. Unified Schema: While the storage location differs, the data schema should remain identical to MCard (content addressable, hash-linked) to reuse the rigorous polynomial logic.

Proposed Architecture:

  • Dual-Database Storage:
    • mcard.db (Public/Shared): Stores standard MCards, Logic (PCards), and Public Keys.
    • vcard.db (Private/Local): Stores VCards, Encrypted Private Keys, and Verification Certificates.
  • Execution Flow: execute(pcard_hash, input, vcard_authorization_hash)
    1. Gatekeeper: PTR checks if vcard_authorization_hash exists in the Private Store (vcard.db).
    2. Zero-Trust Verify: Runtime validates the VCard's cryptographic integrity and permissions (Security Polynomial).
    3. Execute: If valid, the PCard logic runs.
    4. Certify: A new VerificationVCard is generated, signed, and stored in vcard.db, linking the Input, Output, and Authority.
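The four steps of the proposed execution flow can be sketched with in-memory dictionaries standing in for mcard.db and vcard.db. Everything here is hypothetical, since the design is still a proposal: the `execute` signature is simplified, `allowed_pcards` is an invented permission field, and an HMAC with a demo key stands in for a real signature.

```python
import hashlib
import hmac
import json

SIGNING_KEY = b"demo-key"  # hypothetical; a real system would not hard-code keys


def content_hash(obj) -> str:
    """SHA-256 over a canonical JSON encoding of the object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def execute(pcard_hash, input_value, vcard_hash, public_store, private_store):
    # 1. Gatekeeper: the authorization must exist in the private store.
    vcard = private_store.get(vcard_hash)
    if vcard is None:
        raise PermissionError("authorization VCard not found")
    # 2. Zero-trust verify: check integrity, then check permissions.
    if content_hash(vcard) != vcard_hash:
        raise PermissionError("VCard content does not match its hash")
    if pcard_hash not in vcard["allowed_pcards"]:
        raise PermissionError("VCard does not authorize this PCard")
    # 3. Execute: run the PCard logic (a plain callable in this sketch).
    output = public_store[pcard_hash](input_value)
    # 4. Certify: link input, output, and authority, then sign and store.
    record = {
        "pcard": pcard_hash,
        "input": content_hash(input_value),
        "output": content_hash(output),
        "authority": vcard_hash,
    }
    signature = hmac.new(
        SIGNING_KEY, content_hash(record).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    certificate = {**record, "signature": signature}
    cert_hash = content_hash(certificate)
    private_store[cert_hash] = certificate
    return output, cert_hash


pcard_hash = content_hash("double")
public_store = {pcard_hash: lambda x: x * 2}
vcard = {"holder": "alice", "allowed_pcards": [pcard_hash]}
vcard_hash = content_hash(vcard)
private_store = {vcard_hash: vcard}

output, cert_hash = execute(pcard_hash, 21, vcard_hash, public_store, private_store)
print(output)                      # 42
print(cert_hash in private_store)  # True
```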

TODOs:

  • Infrastructure: Implement PrivateCollection (wrapper around vcard.db) in Python and JavaScript factories.
  • Encryption Middleware: Add a transparent encryption layer (e.g., AES-GCM) for the Private Collection to ensure Encryption-at-Rest.
  • CLI Auth: Update run_clms.py to accept --auth <vcard_hash> and mount the private keystore.
  • Certificate Generation: Implement the VerificationVCard schema and generation logic in CLMRunner.

Logical Model Certification & Functional Deployment

Use of the Cubical Logic Model (CLM) as a "Qualified Logical Model" is strictly governed by principles derived from Eelco Dolstra's The Purely Functional Software Deployment Model (the theoretical basis of Nix).

A CLM is not merely source code; it is a candidate for certification. It only becomes a Qualified Logical Model when it possesses a valid Certification, which is a cryptographic proof of successful execution by a specific version of the Polynomial Type Runtime (PTR).

The Functional Certification Equation:

$$ Observation = PTR_{vX.Y.Z}(CLM_{Source}) $$

$$ Certification = Sign_{Authority}(Hash(CLM_{Source}) + Hash(PTR_{vX.Y.Z}) + Hash(Observation)) $$

Parallels to the Nix Model:

  1. Hermetic Inputs: Just as a Nix derivation hashes all inputs (compiler, libs, source), a CLM Certification depends on the exact PTR Runtime Version and CLM Content Hash. Changing the runtime version invalidates the certificate, requiring re-qualification (re-execution).
  2. Deterministic Derivation: The "build" step is the execution of the CLM's verification logic. If the PTR (the builder) is deterministic, the output (VerificationVCard) is reproducible.
  3. The "Store": The mcard.db acts as the Nix Store, holding immutable, content-addressed CLMs. The vcard.db acts as the binary cache, holding signed Certifications (outputs) that prove a CLM works for a given runtime configuration.
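The certification equation and the hermetic-input property can be illustrated with a small sketch. It rests on several assumptions: `+` is read as byte concatenation, the PTR is represented only by its version string, and an HMAC stands in for the authority's signature (a real scheme would more likely use a public-key signature). `certify` is a hypothetical name.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    """Hash(...) in the equation, taken here to be SHA-256."""
    return hashlib.sha256(data).digest()


def certify(clm_source: bytes, ptr_version: str, observation: bytes,
            authority_key: bytes) -> str:
    """Sign(Hash(CLM) + Hash(PTR) + Hash(Observation)), with HMAC as a stand-in."""
    message = h(clm_source) + h(ptr_version.encode("utf-8")) + h(observation)
    return hmac.new(authority_key, message, hashlib.sha256).hexdigest()


key = b"authority-demo-key"  # hypothetical key for the example
source = b"clm: addition"    # hypothetical CLM content
observed = b"result: 5"      # hypothetical observation

cert_a = certify(source, "v1.0.0", observed, key)
cert_b = certify(source, "v1.0.0", observed, key)
cert_c = certify(source, "v1.0.1", observed, key)

print(cert_a == cert_b)  # True: same inputs reproduce the certification
print(cert_a == cert_c)  # False: a new runtime version requires re-qualification
```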

This ensures that a "Qualified CLM" is not just "code that looks right," but "code that has logically proven itself" within a specific, physically identifiable execution environment.


License

This project is licensed under the MIT License; see LICENSE.

For release notes, check CHANGELOG.md.
