
The Borg Project

Universal model assimilation into a sparse, brain-like spiking neural superintelligence, built on crdt-merge.

Any existing model (RoBERTa, DeepSeek-Coder, Phi-3, a ViT, or another SNN variant) can be converted into a sparse spiking shard and merged into a single growing state. Every merge is mathematically conflict-free, intelligence is preserved, and the resulting model is event-driven, quantisable to INT8, and runnable on commodity CPUs or phones.

Three pillars plus the collective layer

  1. Universal absorption. Training-free ANN-to-SNN conversion turns any model checkpoint into a compatible sparse contribution, with no fine-tuning (see the conversion sketch after this list).
  2. Spike-preserving merge. A sparse-delta adapter on top of crdt-merge's OR-Set, Merkle tree, provenance log, and E4 trust lattice treats individual weight deltas, timing traces, and activation events as discrete items, so naive averaging never blurs spike timings into oblivion.
  3. Predictive refinement. An optional variational free-energy step after CRDT resolve tunes thresholds and timings to minimise prediction error, mirroring biological active inference.
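
The first pillar's conversion is illustrated below with the standard data-based threshold-balancing recipe: replace each ReLU with an integrate-and-fire neuron whose threshold is the largest activation seen on calibration data, so spike rates recover the original activations. This is a minimal numpy sketch of that general technique, not the actual convert.py implementation (which also covers the MBE path).

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))               # one ANN layer with ReLU activation
x = rng.random(8)
ann_out = np.maximum(W @ x, 0.0)          # reference ANN activations

# Threshold balancing: the IF threshold is the largest calibration
# activation, so per-step firing rates stay within [0, 1].
theta = ann_out.max()

T = 1000                                  # simulation timesteps
v = np.zeros(4)                           # membrane potentials
spikes = np.zeros(4)
for _ in range(T):
    v += W @ x                            # constant input current each step
    fired = v >= theta
    spikes += fired
    v[fired] -= theta                     # soft reset keeps the residual charge

snn_out = spikes / T * theta              # decode spike rates back to activations
assert np.allclose(ann_out, snn_out, atol=theta / T)  # error shrinks as T grows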

Plus the collective gossip layer: every clone is a full peer on a Merkle-DAG of signed contribution envelopes. Pairwise gossip converges the global state with no central master, no rate limits, and automatic deduplication of redundant absorptions. See docs/collective.md and borg.collective.

Plus the capability registry: every absorbed model is tagged with its head type (causal_lm, masked_lm, embedder, classifier, unknown) and persisted on the state. At query time, borg.decode.universal_decode dispatches to the correct decoder path automatically -- so the Borg produces a readable answer whether you've absorbed a causal LM, a sentence embedder, a classifier, or any combination. The manifest is cumulative across absorbs: the diagnostic panel shows every model ever absorbed, not just the latest round.
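
Schematically, that routing is a dispatch table keyed on head type. The sketch below is illustrative pseudocode only, not the borg.decode.universal_decode implementation; the handler bodies are stand-in stubs.

# Illustrative dispatch-by-capability pattern (handler bodies are stubs,
# not the real decoders in borg.decode).
def universal_decode_sketch(head_type: str, query: str) -> str:
    handlers = {
        "causal_lm": lambda q: f"token-level decode of {q!r}",
        "masked_lm": lambda q: f"mask-fill over {q!r}",
        "classifier": lambda q: f"class activations for {q!r}",
        "embedder": lambda q: f"RAG-grounded answer for {q!r}",
    }
    # "unknown" and anything unregistered fall through to the CoT report path.
    return handlers.get(head_type, lambda q: f"chain-of-thought report for {q!r}")(query)

print(universal_decode_sketch("masked_lm", "Paris is the capital of [MASK]."))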

Layout

the-borg-project/
  docs/
    vision.md, architecture.md, plain-english.md
    roadmap.md, sprint-plan.md, waterfall.md, status-matrix.md
    glossary.md, world-first.md, brain.md, convergence-path.md
    neurocrdt-principles.md, collective.md
    specs/                          per-pillar technical specs
    adrs/                           architecture decision records
  src/borg/
    assimilation.py                 top-level assimilate-and-merge entry
    convert.py                      ANN-to-SNN conversion (feedforward + MBE)
    sparse.py                       sparse-delta envelope with timing trace
    merge.py                        merge orchestration + probe gate + cumulative manifest
    e4.py                           crdt_merge.e4 trust-lattice wiring
    fep.py                          variational free-energy refinement
    calibration.py                  post-merge per-vocab logit rescaling
    inference.py                    event-driven sparse SNN forward pass
    decode.py                       token-level + universal (capability-agnostic) decoder
    heads.py                        capability registry (causal_lm / masked_lm / embedder / classifier / unknown)
    bench.py                        fidelity benchmarks (MBE vs reference)
    rag/                            document ingest and retrieval
    speech.py                       microphone transcription adapter
    app.py                          Gradio demo with chain-of-thought prompt
    worker.py                       FastAPI worker + Ed25519 envelopes
    collective/                     P2P gossip layer (CID, Bloom, Peer, sync)
  proto/                            JSON schemas for contribution envelopes
  examples/end-to-end/              local simulation scripts + HF Space config
  examples/colab/                   Colab notebooks for Nord + continuous absorb
  scripts/                          hooks, audit, benchmarks, Space wipe
  tests/                            unit + regression tests
  .github/workflows/                CI (lint, type check, tests, audit, benches)

Status

Private, pre-publication. The scaffold is real code backed by tests. Every closed gap is documented in docs/status-matrix.md. See TASKS.md for the phased plan, docs/roadmap.md for the delivery calendar, and docs/sprint-plan.md for the two-week sprint breakdown.

CI runs ruff, mypy, and pytest on Python 3.10, 3.11, and 3.12, plus a dedicated forensic audit and JSON-schema validation.

Install

pip install the-borg-project
# or with optional extras
pip install "the-borg-project[rag,speech,app,convert,worker]"

Quick start (from source for development)

python -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
bash scripts/install-hooks.sh
bash scripts/run-tests.sh

Run the demo

python -m borg.app

Opens a local Gradio UI with chat, document upload (RAG), and microphone input. The chat dispatches through borg.decode.universal_decode, which routes per registered head type (causal LM → token decode; masked LM → mask-fill; classifier → class activations; embedder → RAG-grounded answer; unknown → three-step CoT report with a latent signature). The diagnostic panel at the top of the UI shows every absorbed model with its head type, layer count, parameter count, and per-model sparsity.

Assimilate a model

from borg.assimilation import assimilate

assimilate(
    model_ids=["FacebookAI/roberta-base"],
    borg_path="borg_snn.pt",
)

Every call merges the new shard into the state at borg_snn.pt without overwriting prior knowledge. The LM head and tokenizer id from the first source model are persisted alongside the merged weights so the UI can decode text immediately after a single assimilation round. The manifest at borg_snn.pt.manifest.json accumulates every absorbed model (dedup'd by model_id), and each model's detected head type is registered under state_dict[HEADS_KEY] so the runtime can dispatch per-model capabilities.
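
To see what the state has accumulated, read the manifest back. The snippet below assumes the manifest is a JSON document whose per-model entries each carry a model_id field; the exact schema is not documented here.

import json
from pathlib import Path

manifest = json.loads(Path("borg_snn.pt.manifest.json").read_text())
# Assumed shape: a list of per-model entries, each with a "model_id" key.
for entry in manifest:
    print(entry["model_id"])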

Absorb a large SNN on Colab

For checkpoints too big for a free GitHub Actions runner (e.g. Nord at 13 GB), absorb on Colab instead: it offers 100+ GB of disk and a persistent session.

Open In Colab

The notebook downloads the source repo, mmaps each safetensors shard via safe_open, drops oversized cross-architecture tensors (LM head, embeddings) automatically, runs assimilate_state_dict per shard, and uploads the resulting borg_snn.pt straight to your HF Space.
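
In outline, the per-shard loop looks like the sketch below. The safetensors calls are real API; the size threshold, the shard name, and the assimilate_state_dict import path and signature are assumptions for illustration.

from safetensors import safe_open

from borg.assimilation import assimilate_state_dict  # import path assumed

MAX_ELEMS = 50_000_000  # illustrative cutoff for oversized cross-architecture tensors

for shard in ["model-00001-of-00003.safetensors"]:  # illustrative shard name
    with safe_open(shard, framework="pt", device="cpu") as f:
        state = {}
        for key in f.keys():
            tensor = f.get_tensor(key)
            if tensor.numel() > MAX_ELEMS:   # e.g. LM head, embedding matrices
                continue
            state[key] = tensor
    assimilate_state_dict(state, borg_path="borg_snn.pt")  # signature assumed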

Join the collective

Every clone is a peer. The collective layer (borg.collective) turns local absorptions into signed envelopes addressed by content hash, and synchronises them pairwise with known peers. The HuggingFace Space is a seed peer in the default list -- not a central master.

from borg.collective import Peer, collective_assimilate

peer = Peer(peer_id="my-node")
result = collective_assimilate(
    state_dict=my_state,
    model_id="org/model",
    local_peer=peer,
)
# result.cids_broadcast holds the Merkle-DAG ids you just published

If model_id has already been absorbed by any peer in the local roster, collective_assimilate short-circuits -- a million clones each running collective_assimilate("gpt2") produce exactly one logical absorption network-wide. See docs/collective.md.
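
Continuing the snippet above, a repeated absorption is a no-op. This assumes cids_broadcast comes back empty on the deduplicated call; the field name is from the example above, but the emptiness behaviour is an inference from the docs, not a verified contract.

repeat = collective_assimilate(
    state_dict=my_state,
    model_id="org/model",
    local_peer=peer,
)
# Assumed: nothing new is published when the model is already absorbed.
assert not repeat.cids_broadcast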

Run the worker

pip install -e .[worker]
uvicorn borg.worker:create_app --factory --host 0.0.0.0 --port 8000

The worker exposes /contribute (accepts Ed25519-signed SparseDelta envelopes), /gossip (exchanges digests with peer workers), and /health. Envelope schema is in proto/contribution.schema.json. Running the worker turns your clone into a reachable peer; the borg.collective.http_sync client-side gossip then converges against you alongside every other peer on the network.
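
A quick liveness check against a local worker (the /health route is listed above; constructing a signed /contribute envelope must follow proto/contribution.schema.json and is out of scope here):

import requests

resp = requests.get("http://localhost:8000/health", timeout=5)
print(resp.status_code, resp.json())  # assumes /health returns a JSON body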

Benchmark the MBE conversion

python scripts/bench_mbe.py --dim 32 --samples 32

Reports Spearman, Pearson, and cosine correlation between a reference transformer and its MBE-converted SNN on random inputs. See src/borg/bench.py for the harness.
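
The three metrics are standard; computed by hand on two illustrative vectors (a reference output and a noisy stand-in for the converted SNN), they look like this. This is not the bench.py harness itself.

import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
ref = rng.normal(size=256)                     # reference transformer outputs
snn = ref + rng.normal(scale=0.05, size=256)   # stand-in for the MBE-converted SNN

rho, _ = spearmanr(ref, snn)
r, _ = pearsonr(ref, snn)
cos = ref @ snn / (np.linalg.norm(ref) * np.linalg.norm(snn))
print(f"spearman={rho:.4f}  pearson={r:.4f}  cosine={cos:.4f}")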

Upstream

crdt-merge provides the OR-Set, Merkle-provenance, and E4 trust primitives this repository depends on. It is licensed BUSL-1.1 (change date 2028-03-29, change license Apache-2.0) with patents held separately (UK Application Nos. 2607132.4, GB2608127.3). See docs/adrs/0006-license-considerations.md for the implications.

License

the-borg-project is licensed under the Business Source License 1.1, mirroring the upstream crdt-merge model. The BUSL Additional Use Grant permits all embedded use (libraries, SaaS, research, internal tooling, commercial products) and blocks only the resale of the-borg-project itself as a competing merge / assimilation / collective-gossip service.

On 2028-03-29 the licence automatically converts to Apache License, Version 2.0.

See LICENSE for the full terms, NOTICE for third-party attributions, and docs/adrs/0006-license-considerations.md for the rationale.

For commercial licensing of out-of-scope uses: rgillespie83@icloud.com / data@optitransfer.ch.

Contributing

See CONTRIBUTING.md. The branch protocol, identity lock, and pre-push forensic audit in scripts/pre-push-audit.sh are mandatory for every contribution.
