MCard: Local-first Content Addressable Storage with Content Type Detection
MCard
MCard is a local-first, content-addressable storage platform with cryptographic integrity, temporal ordering, and a Polynomial Type Runtime (PTR) that orchestrates polyglot execution. It gives teams a verifiable data backbone without sacrificing developer ergonomics or observability.
Highlights
- Hash-verifiable storage: Unified network of relationships via SHA-256 hashing across content, handles, and history.
- Kenotic Identity System: Independent Soul Bound Token (SBT) issuance and VCard Gatekeeper authorization using strict decoupled IDENTITY_SPACE_PATH domains.
- Universal Substrate: Emulates the Turing Machine "Infinitely Long Tape" via relational queries for computable DSLs.
- Deterministic execution: PTR mediates 8 polyglot runtimes (Python, JavaScript, Rust, C, WASM, Lean, R, Julia).
- Enterprise ready: Structured logging, CI/CD pipeline, security auditing, 99%+ automated test coverage.
- AI-native extensions: GraphRAG engine, optional LLM runtime, and optimized multimodal vision (moondream).
- Quantum NLP: Optional lambeq + PyTorch integration for pregroup grammar and quantum circuit compilation.
- Developer friendly: Rich Python API, TypeScript SDK, BMAD-driven TDD workflow, numerous examples.
- Algorithm Benchmarks: Sine comparison (Taylor vs Chebyshev) across Python, C, and Rust.
- High Performance: Optimized test suite (~37s) with runtime caching and session-scoped fixtures.
- DuckDB Engine: Optional columnar OLAP storage backend with the same StorageEngine interface, ideal for analytical workloads and Parquet I/O.
- Single Source of Truth Schema: Both SQLite and DuckDB engines load schema exclusively from canonical SQL files (mcard_schema.sql, mcard_vector_schema.sql), with zero hardcoded CREATE TABLE statements.
- Shared MIME Registry: A single mime_extensions.json drives content-type detection across both Python and TypeScript; edit one file to update both runtimes, no recompilation needed.
- Centralized Operational Constants: All timing literals (PTR timeouts, LLM defaults, RAG embedding delays, signaling server intervals) are unified in config_constants.py with full TypeScript parity; zero hard-coded timeouts remain in runtime code.
For the long-form narrative and chapter roadmap, see docs/theory/Narrative_Roadmap.md. Architectural philosophy is captured in docs/architecture/Monadic_Duality.md.
Quick Start (Python)
git clone https://github.com/xlp0/MCard_TDD.git
cd MCard_TDD
make setup-dev # creates .venv with uv, installs all deps + pre-commit
make project-check # validates Python, mcard-js, and mcard-studio
uv run pytest -q -m "not slow" # run the fast Python test suite
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml
Development Setup
This project uses uv as the sole Python dependency manager. All dependencies are defined in pyproject.toml and locked in uv.lock.
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create venv and install all dependencies (including dev)
make setup-dev
# Or manually:
uv venv --prompt MCard_TDD
uv sync --all-extras --dev
# Run commands via uv
uv run pytest # run tests
uv run ruff check mcard/ # lint
uv run python script.py # run any script
Configuration
- Copy example.env to .env for local development.
- example.env is the canonical template; .env.example remains as a legacy-compatible copy.
- The studio app uses mcard-studio/example.env for its own local .env seed.
- Environment precedence is: CLI overrides → process environment → .env → example.env defaults.
- Shared config defaults are defined in config/app_config.json and consumed by Python, TypeScript, and studio runtime helpers.
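The precedence order above can be illustrated with a short sketch. This is not the project's loader: the function names and the simple KEY=VALUE parsing are assumptions made for illustration, and the real helpers may handle quoting and config/app_config.json differently.

```python
from collections import ChainMap


def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve_settings(cli, process_env, dotenv_text, example_text):
    """Layer the four sources; ChainMap looks them up left to right,
    so a key found in an earlier layer shadows the later ones."""
    return ChainMap(
        dict(cli),                     # 1. CLI overrides
        dict(process_env),             # 2. process environment
        parse_env_text(dotenv_text),   # 3. .env
        parse_env_text(example_text),  # 4. example.env defaults
    )
```

Passing `os.environ` as `process_env` and the contents of the two env files as text reproduces the documented lookup order without mutating the process environment.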
Create and retrieve a card:
from mcard import MCard, default_collection
card = MCard("Hello MCard")
hash_value = default_collection.add(card)
retrieved = default_collection.get(hash_value)
print(retrieved.get_content(as_text=True))
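To make the idea behind the returned hash concrete, here is a toy content-addressed store. It is a sketch of the general technique only: TinyContentStore is a hypothetical name, and it assumes the address is the SHA-256 of the raw bytes, which may not match how MCard derives its hashes.

```python
import hashlib


class TinyContentStore:
    """Toy in-memory store keyed by the SHA-256 digest of the content."""

    def __init__(self) -> None:
        self._blobs = {}

    def add(self, content: bytes) -> str:
        # The address is derived from the content, so adding the same
        # bytes twice yields the same key and stores only one copy.
        digest = hashlib.sha256(content).hexdigest()
        self._blobs.setdefault(digest, content)
        return digest

    def get(self, digest: str) -> bytes:
        content = self._blobs[digest]
        # Re-hash on read so corruption or tampering is detected.
        if hashlib.sha256(content).hexdigest() != digest:
            raise ValueError("stored content does not match its hash")
        return content
```

Because the key is a function of the content, any reader can verify what it received without trusting the store.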
Quick Start (JavaScript / WASM)
See mcard-js/README.md for build, testing, and npm publishing instructions for the TypeScript implementation.
- mcard-studio: The interactive PWA IDE; see mcard-studio/README.md for setup and architecture.
Quick Start (Quantum NLP)
MCard optionally integrates with lambeq for quantum natural language processing using pregroup grammar:
# Install with Quantum NLP support (requires Python 3.10+)
uv pip install -e ".[qnlp]"
# Parse a sentence into a pregroup grammar diagram
uv run python scripts/lambeq_web.py "John gave Mary a flower"
Example output (pregroup types):
John: n
gave: n.r @ s @ n.l @ n.l
Mary: n
a flower: n
Result: s (grammatically valid sentence)
The pregroup diagrams can be compiled to quantum circuits for QNLP experiments.
Polyglot Runtime Matrix
| Runtime | Status | Notes |
|---|---|---|
| Python | Supported | Reference implementation, CLM runner |
| JavaScript | Supported | Node + browser (WASM) + Full RAG Support + Pyodide |
| Rust | Supported | High-performance adapter & WASM target |
| C | Supported | Low-level runtime integration |
| WASM | Supported | Edge and sandbox execution |
| Lean | Supported (setup required) | Formal verification pipeline (requires lean-toolchain) |
| R | Supported | Statistical computing runtime |
| Julia | Supported | High-performance scientific computing |
Lean Configuration: A lean-toolchain file in the project root is critical. Without it, elan will attempt to resolve/download toolchain metadata on every invocation, causing CLM execution to hang or become unbearably slow.
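A pre-flight check along these lines can catch the missing file before any Lean CLM runs. This is a hypothetical helper, not part of the MCard codebase; it only tests that the file exists and is non-empty.

```python
from pathlib import Path


def lean_toolchain_pinned(project_root: str) -> bool:
    """Return True when a non-empty lean-toolchain file exists in project_root."""
    toolchain_file = Path(project_root) / "lean-toolchain"
    return toolchain_file.is_file() and toolchain_file.read_text().strip() != ""
```

A runner could call this once at startup and skip or fail the Lean CLMs early instead of waiting on elan.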
Compiling Native Binaries
For detailed instructions on compiling the required C, Rust, and WASM binaries for the polyglot tests, please see the Compiling Native Binaries Guide.
Project Structure (abridged)
MCard_TDD/
├── mcard/ # Python package (engines, models, PTR)
├── mcard-js/ # TypeScript SDK: 3 interchangeable storage engines, PTR, RAG
├── mcard-studio/ # [Submodule] Astro + React PWA: artifact IDE with VCard events
├── LandingPage/ # [Submodule] Static-first P2P Documentation & Landing Portal
├── chapters/ # CLM specifications (polyglot demos)
├── docs/ # Architecture, PRD, guides, reports
├── scripts/ # Automation & demo scripts
├── tests/ # >815 automated tests (Python)
├── mime_extensions.json # Shared MIME-type registry (Python + TypeScript)
└── pyproject.toml # uv-managed dependencies (uv.lock)
Submodule Organization
This repository orchestrates two key frontend applications as submodules, each serving a distinct role in the MCard ecosystem:
1. mcard-studio (/mcard-studio)
The Interactive IDE for Eventual Consistency & Eventual Correctness. A Progressive Web App (PWA) built with Astro and React. It serves as the primary interface for creating, editing, and executing MCards and CLMs.
- Tech Stack: Astro, React, Zustand, Monaco Editor, Anime.js/GSAP, Mermaid.
- Key Features: Four-store persistence (servermemory.db, browsermemory.db, execution_logs.db, filesystem), dual-mode CLM execution (browser-first JS → server fallback), VCard event pipeline with result sealing, inline rename/upload/create, version history with time-travel, native AI assistant (Ollama), 30+ file type renderers.
- Role: The "Editor" & "Runtime" environment for developers and power users.
- Test Results: 372 tests passed (37 test files).
2. LandingPage (/LandingPage)
The Public Portal & Knowledge Container. A static-first modular web application designed for decentralized distribution. It focuses on P2P communication, documentation rendering, and interactive 3D visualizations.
- Tech Stack: Vanilla JS Modules, WebRTC (No signaling server), Three.js, KaTeX, Mermaid, ZITADEL (OIDC Authentication).
- Key Features: Serverless P2P mesh networking, zero-dependency architecture (runs locally without build steps), rich markdown/media rendering, and interactive authenticated pedagogical games (Monopoly, Chess, Go) with ZITADEL SSO integration.
- Role: The "Viewer" & "distributable container" for the Personal Knowledge Container (PKC) concept.
Documentation
- Product requirements: docs/specifications/prd.md
- Architecture overview: docs/architecture/overview.md
- Schema principles: schema/README.md (Empty Schema grounding, verification-first storage, and the core/extension split).
- mcard-js schema reference: mcard-js/schema/README.md (practical explanation of mcard_schema.sql and mcard_vector_schema.sql).
- DOTS vocabulary: docs/WorkingNotes/Hub/Theory/Integration/DOTS Vocabulary as Efficient Representation for ABC Curriculum.md
- Monad-Polynomial philosophy: docs/architecture/Monadic_Duality.md
- Narrative roadmap & chapters: docs/theory/Narrative_Roadmap.md
- Logging system: docs/guides/LOGGING_GUIDE.md
- PTR & CLM reference: docs/specifications/CLM_Language_Specification.md, docs/archive/PCard Architecture.md
- Reports & execution summaries: docs/reports/
- WebSocket Performance Debugging
- Petri Net Implementation: Physical implementation mapping
Platform Vision & Architecture
The theoretical foundations, including the Function Economy, Petri Net Scheduler, Dual-Handle Memory Architecture, and Concurrency Protection, have been consolidated into the Platform Vision Document.
- DOTS → PTR Meta-Language: Theoretical framework
Recent Updates
Full changelog: CHANGELOG.md
Current versions: Python mcard 0.1.65 · TypeScript mcard-js 2.1.48
Autonomous Mesh Discovery & Identity Validation (April 02, 2026): Fully implemented mDNS "Friendly Network" peer discovery in both the Python and JavaScript runtimes. The single-boot launch command now natively spins up the node, API, Studio, and zero-conf sniffer all in one process. GTime triplet execution traces now correctly extract and cryptographically bind the authentic user identity (did:key:xxxx) pulled dynamically from the Studio UI's native IdentityStore rather than using environment variables. All parity tasks completed.
PTR Security Hardening (Bandit Remediation, March 31, 2026): Systematic remediation of all Medium and High severity findings from Bandit static analysis across 19 files. Introduced mcard/utils/url_safety.py, a centralized safe_urlopen() helper that validates URL schemes (http/https only) before opening, replacing raw urllib.request.urlopen calls in 7 modules. Migrated XML parsing to defusedxml.ElementTree (with stdlib fallback) to prevent XXE/billion-laughs attacks. Eliminated shell=True subprocess calls in the signaling server by converting to safe array-based invocations. Resolved the weak-MD5 finding in network cache key generation by passing usedforsecurity=False. All 770 Python tests pass with 0 High / 0 Medium Bandit findings.
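The scheme check described in this entry can be sketched as follows. This is an illustration of the technique, not the contents of mcard/utils/url_safety.py: the names check_url_scheme and open_checked and the default timeout are assumptions.

```python
from urllib.parse import urlparse
from urllib.request import urlopen

ALLOWED_SCHEMES = {"http", "https"}


def check_url_scheme(url: str) -> str:
    """Return url unchanged if its scheme is http or https, else raise."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        # Blocks file://, ftp://, and scheme-less local paths alike.
        raise ValueError(f"refusing to open URL with scheme {scheme!r}")
    return url


def open_checked(url: str, timeout: float = 10.0):
    """Validate the scheme first, then delegate to urllib's urlopen."""
    return urlopen(check_url_scheme(url), timeout=timeout)
```

Routing every call site through one helper keeps the allow-list in a single place, which is the point of centralizing it.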
Config Unification, Canonical Env Templates, and Smoke Cleanup (March 28, 2026): Centralized shared configuration defaults in config/app_config.json with parity across Python, TypeScript, and studio runtime helpers; standardized the root example.env / legacy .env.example contract plus mcard-studio/example.env; documented environment precedence and CI seeding; and quieted boot-time ingest warnings during smoke runs while preserving the startup summary and the standalone websocket server path.
Binary File Size Limit Increase (v0.1.64 / v2.1.47): Raised the maximum binary file size from 50 MB to 150 MB across all runtimes (Python, TypeScript, Studio), enabling ingestion and preview of larger PDF, video, and other binary assets. Updated MAX_FILE_SIZE, MAX_ARTIFACT_SAVE_BYTES, MAX_BINARY_PREVIEW_BYTES, and problematic-file thresholds in 7 files.
Polyglot Runtime Stabilization & CLM Pipeline (v0.1.63 / v2.1.46): Stabilized the cross-environment execution pipeline for Cubical Logic Models (CLMs) across Python, Node.js, and Rust/WASM. Improved Python interpreter resolution in the JS SDK, added first-class Node.js runtime support, and implemented automatic server-side delegation for Node-dependent browser CLMs. Fixed gatekeeper evaluations and test assertions across all 13 prologue CLMs, achieving 100% test pass rates in both the CLI and Studio environments.
Pedagogical Game Integration via ZITADEL SSO: Added authenticated views for Chess, Go, and Monopoly to LandingPage, using ZITADEL OIDC to protect game state boundaries and user identity mapping as part of the broader ethnographic scaling research implementations.
Identity System Authentication Update (v0.1.62 / v2.1.45): Implemented a complete credential-based identity authentication system in mcard-studio, replacing the previous passwordless registration flow with server-side PBKDF2 password hashing and timing-safe login verification.
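As an illustration of the two techniques named in this entry, here is a Python sketch of PBKDF2 hashing with a constant-time comparison. The Studio implementation is not shown in this document, so the iteration count, salt length, and function names below are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, not taken from the project


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; generate a random salt if none is given."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare in constant time."""
    _, candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(candidate, expected)
```

Only the salt and digest need to be stored on the server; the plaintext password is never persisted.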
Polyglot Runtime & WASM Integration Fixes (v0.1.60 / v2.1.43): Fixed critical missing --experimental-wasm-modules environment flags in Python subprocess integration, rectified module:// protocol resolution for cross-environment testing, and enforced importlib dynamic module bootstrapping in NodeJS. Patched Rust/TypeScript content-type string detection parity bugs and Python 3.9 __future__ typing incompatibilities.
Python Build & Syntax Corrections (v0.1.59 / v2.1.42): Fixed mcard syntax, namespace, and import dependency compilation errors across improved_logging.py, card_collection.py, and logging_config.py. Restored 100% test build health.
Storage Layer Deduplication (v0.1.58 / v2.1.42): Comprehensive refactoring to centralize card operations into AbstractSqlEngine (TypeScript) and simplify connection paths via resolve_db_path() (Python). Eliminated over 250 lines of duplicate code across the 4 TypeScript SQL engines, standardized dialect handling (SQLite TEXT vs DuckDB VARCHAR), and fixed a latent foreign key bug in handle renaming. All functionality remains fully backward compatible.
Major Project Restructuring (v0.1.56 / v2.1.38): A four-phase structural overhaul: root files relocated to proper directories, documentation reorganized (32 flat files → 7 subdirectories), scripts reorganized (19 files → 6 subdirectories), Python tests restructured (28 files → 5 subdirectories), TypeScript engine implementations moved to storage/engines/ with barrel re-exports, factory pattern migration (SqliteNodeEngine.create()), test database cleanup, and .gitignore hardening. Runtime behavior is unchanged; all 849 TS tests and 767 Python tests pass.
Recent milestones also include DuckDB as an alternative storage engine, shared MIME registry (mime_extensions.json), PTR exception narrowing (109 broad catches → specific types), SqlJs vector adapter for browser-based vector search, and ContentTypeInterpreter event-loop starvation fix. See CHANGELOG.md for full details.
Testing
Note: All commands below should be run from the project root (MCard_TDD/).
Unit Tests
# Python
uv run pytest -q # Run all tests
uv run pytest -q -m "not slow" # Fast tests only
uv run pytest -m "not network" # Skip LLM/Ollama tests
# JavaScript
npm --prefix mcard-js test -- --run
# Browser (MCard Studio)
npm --prefix mcard-studio run test:unit -- --run
CLM Verification
Both Python and JavaScript CLM runners support three modes: all, directory, and single file.
Python
# Run all CLMs
uv run python scripts/clm/run_clms.py
# Run by directory
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P
# Run single file
uv run python scripts/clm/run_clms.py chapters/chapter_01_arithmetic/addition.yaml
# Run with custom context
uv run python scripts/clm/run_clms.py chapters/chapter_08_P2P/generic_session.yaml \
--context '{"sessionId": "my-session"}'
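The three modes above amount to a small target-resolution step. The sketch below shows one way to express it; it is not the logic of scripts/clm/run_clms.py, and resolve_clm_targets and its parameters are hypothetical.

```python
from pathlib import Path
from typing import List, Optional


def resolve_clm_targets(chapters_root: str, target: Optional[str] = None) -> List[Path]:
    """Map an optional CLI argument to the list of CLM YAML files to run."""
    if target is None:
        # Mode 1: no argument, so run every CLM under the chapters root.
        return sorted(Path(chapters_root).rglob("*.yaml"))
    path = Path(target)
    if path.is_dir():
        # Mode 2: a directory, so run every CLM inside it.
        return sorted(path.rglob("*.yaml"))
    if path.is_file():
        # Mode 3: a single CLM file.
        return [path]
    raise FileNotFoundError(f"no CLM file or directory at {target}")
```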
JavaScript
# Run all CLMs
npm --prefix mcard-js run clm:all
# Run by directory/filter
npm --prefix mcard-js run clm:all -- chapter_01_arithmetic
npm --prefix mcard-js run clm:all -- chapters/chapter_08_P2P
# Run single file
npm --prefix mcard-js run demo:clm -- chapters/chapter_01_arithmetic/addition_js.yaml
Chapter Directories
| Directory | Description |
|---|---|
| chapter_00_prologue | Hello World, Lambda calculus, identity basics, and Church encoding (13 CLMs) |
| chapter_01_arithmetic | Arithmetic operations (Python, JS, Lean) (27 CLMs) |
| chapter_02_handle | Handle operations and dual retrieval |
| chapter_03_llm | LLM integration (requires Ollama) |
| chapter_04_load_dir | Filesystem and collection loading |
| chapter_05_reflection | Meta-programming and recursive CLMs |
| chapter_06_lambda | Lambda calculus runtime |
| chapter_07_network | HTTP requests, MCard sync, network I/O (5 CLMs) |
| chapter_08_P2P | P2P networking and WebRTC (16 CLMs, 3 VCard) |
| chapter_09_DSL | Meta-circular language definition and combinators (10 CLMs) |
| chapter_10_service | Static server builtin and service management (3 CLMs) |
Contributing
- Fork the repository and create a feature branch.
- Read CONTRIBUTING.md for the normalized workflow and project-wide validation commands.
- Submit a pull request describing your change and tests.
Future Roadmap
Road to VCard (Design & Implementation)
Based on the MVP Cards Design Rationale, a VCard (Value Card) represents a boundary-enforced value exchange unit that often contains sensitive privacy data (identities, private keys, financial claims). Unlike standard MCards which are designed for public distribution and reproducibility, VCards require strict confidentiality.
Design Requirements & Rationale:
- Privacy & Encryption: VCards cannot be stored in the standard mcard.db (which is often shared or public) without encryption. They must be stored in a "physically separate" container or be encrypted at rest.
- Authentication Primitive: A VCard serves as a specialized "Certificate of Authority", a precondition for executing sensitive PTR actions.
- Audit Certificates: Execution of a VCard-authorized action must produce a VerificationVCard (Certificate of Execution), which proves the action occurred under authorization. This certificate is also sensitive.
- Unified Schema: While the storage location differs, the data schema should remain identical to MCard (content addressable, hash-linked) to reuse the rigorous polynomial logic.
Proposed Architecture:
- Dual-Database Storage:
  - mcard.db (Public/Shared): Stores standard MCards, Logic (PCards), and Public Keys.
  - vcard.db (Private/Local): Stores VCards, Encrypted Private Keys, and Verification Certificates.
- Execution Flow: execute(pcard_hash, input, vcard_authorization_hash)
  - Gatekeeper: PTR checks if vcard_authorization_hash exists in the Private Store (vcard.db).
  - Zero-Trust Verify: Runtime validates the VCard's cryptographic integrity and permissions (Security Polynomial).
  - Execute: If valid, the PCard logic runs.
  - Certify: A new VerificationVCard is generated, signed, and stored in vcard.db, linking the Input, Output, and Authority.
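The proposed execution flow can be sketched in a few lines. Everything here is illustrative and rests on stated assumptions: a plain dict stands in for vcard.db, a callable stands in for resolving pcard_hash, an HMAC stands in for a real signature, and the permission check (Security Polynomial) is omitted.

```python
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def execute(pcard_fn, value, vcard_authorization_hash, private_store, signing_key):
    """Run pcard_fn(value) only under a valid VCard and record a certificate."""
    # 1. Gatekeeper: the authorization must exist in the private store.
    vcard = private_store.get(vcard_authorization_hash)
    if vcard is None:
        raise PermissionError("unknown VCard authorization")
    # 2. Zero-trust verify: re-hash the stored VCard against its address.
    if sha256_hex(vcard) != vcard_authorization_hash:
        raise PermissionError("VCard content does not match its hash")
    # 3. Execute the logic.
    output = pcard_fn(value)
    # 4. Certify: link input, output, and authority, then sign and store.
    record = json.dumps(
        {"input": value, "output": output, "authority": vcard_authorization_hash},
        sort_keys=True,
    ).encode("utf-8")
    signature = hmac.new(signing_key, record, hashlib.sha256).hexdigest()
    certificate = json.dumps(
        {"record": record.decode("utf-8"), "signature": signature}, sort_keys=True
    ).encode("utf-8")
    certificate_hash = sha256_hex(certificate)
    private_store[certificate_hash] = certificate
    return output, certificate_hash
```

Storing the certificate under its own hash keeps it content-addressed, in line with the Unified Schema requirement.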
TODOs:
- Infrastructure: Implement PrivateCollection (wrapper around vcard.db) in Python and JavaScript factories.
- Encryption Middleware: Add a transparent encryption layer (e.g., AES-GCM) for the Private Collection to ensure Encryption-at-Rest.
- CLI Auth: Update run_clms.py to accept --auth <vcard_hash> and mount the private keystore.
- Certificate Generation: Implement the VerificationVCard schema and generation logic in CLMRunner.
Logical Model Certification & Functional Deployment
Use of the Cubical Logic Model (CLM) as a "Qualified Logical Model" is strictly governed by principles derived from Eelco Dolstra's The Purely Functional Software Deployment Model (the theoretical basis of Nix).
A CLM is not merely source code; it is a candidate for certification. It only becomes a Qualified Logical Model when it possesses a valid Certification, which is a cryptographic proof of successful execution by a specific version of the Polynomial Type Runtime (PTR).
The Functional Certification Equation:
$$ Observation = PTR_{vX.Y.Z}(CLM_{Source}) $$
$$ Certification = Sign_{Authority}(Hash(CLM_{Source}) + Hash(PTR_{vX.Y.Z}) + Hash(Observation)) $$
Parallels to the Nix Model:
- Hermetic Inputs: Just as a Nix derivation hashes all inputs (compiler, libs, source), a CLM Certification depends on the exact PTR Runtime Version and CLM Content Hash. Changing the runtime version invalidates the certificate, requiring re-qualification (re-execution).
- Deterministic Derivation: The "build" step is the execution of the CLM's verification logic. If the PTR (the builder) is deterministic, the output (VerificationVCard) is reproducible.
- The "Store": The mcard.db acts as the Nix Store, holding immutable, content-addressed CLMs. The vcard.db acts as the binary cache, holding signed Certifications (outputs) that prove a CLM works for a given runtime configuration.
This ensures that a "Qualified CLM" is not just "code that looks right," but "code that has logically proven itself" within a specific, physically identifiable execution environment.
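The certification equation can be read as the following sketch. It makes several assumptions that the text does not settle: "+" is taken as byte concatenation, the runtime is identified by hashing its version string, and an HMAC under an authority key stands in for the signature scheme.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()


def certify(clm_source: bytes, ptr_version: str, observation: bytes, authority_key: bytes) -> str:
    """Sign Hash(CLM) + Hash(PTR version) + Hash(Observation)."""
    material = h(clm_source) + h(ptr_version.encode("utf-8")) + h(observation)
    return hmac.new(authority_key, material, hashlib.sha256).hexdigest()
```

Because the runtime version is part of the signed material, the same CLM and observation under a different PTR version yield a different certification, which is the re-qualification behavior described under Hermetic Inputs.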
License
This project is licensed under the MIT License; see LICENSE.
For release notes, check CHANGELOG.md.