The Borg Project
Universal model assimilation into a sparse, brain-like spiking neural
superintelligence, built on crdt-merge.
Any existing model (RoBERTa, DeepSeek-Coder, Phi-3, a ViT, or another SNN variant) can be converted into a sparse spiking shard and merged into a single growing state. Every merge is mathematically conflict-free, intelligence is preserved, and the resulting model is event-driven, quantisable to INT8, and runnable on commodity CPUs or phones.
Three pillars plus the collective layer
- Universal absorption. Training-free ANN-to-SNN conversion turns any model checkpoint into a compatible sparse contribution. No fine-tuning.
- Spike-preserving merge. A sparse-delta adapter on top of crdt-merge's OR-Set, Merkle tree, provenance log, and E4 trust lattice treats individual weight deltas, timing traces, and activation events as discrete items, so naive averaging never blurs spike timings into oblivion.
- Predictive refinement. An optional variational free-energy step after CRDT resolve tunes thresholds and timings to minimise prediction error, mirroring biological active inference.
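To illustrate why set-union merging preserves spike information where averaging would not, here is a minimal sketch. The function and tag shapes are illustrative stand-ins, not the crdt-merge API:

```python
# Illustrative sketch: merging sparse weight deltas as discrete OR-Set items.
# Each shard contributes (origin, layer, index) -> delta entries under a
# unique tag; merging is set union, so two contributions to the same weight
# coexist instead of being averaged, and timing information survives intact.

def merge_sparse_deltas(*shards):
    """Union all tagged delta items; identical tags deduplicate automatically."""
    merged = {}
    for shard in shards:
        for tag, item in shard.items():
            merged[tag] = item  # same tag implies same item, so overwrite is safe
    return merged

shard_a = {("roberta", "layer0", 42): +0.031}
shard_b = {("phi3", "layer0", 42): -0.017}

state = merge_sparse_deltas(shard_a, shard_b)
# Both deltas coexist under distinct tags -- nothing is blurred together.
assert len(state) == 2
```

A plain average of the two deltas would have produced a single value of +0.007, destroying both contributions; the union keeps each one addressable for later timing-aware resolution.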
Plus the collective gossip layer: every clone is a full peer on a
Merkle-DAG of signed contribution envelopes. Pairwise gossip converges
the global state with no central master, no rate limits, and automatic
deduplication of redundant absorptions. See docs/collective.md and
borg.collective.
Plus the capability registry: every absorbed model is tagged with
its head type (causal_lm, masked_lm, embedder, classifier,
unknown) and persisted on the state. At query time,
borg.decode.universal_decode dispatches to the correct decoder path
automatically -- so the Borg produces a readable answer whether you've
absorbed a causal LM, a sentence embedder, a classifier, or any
combination. The manifest is cumulative across absorbs: the diagnostic
panel shows every model ever absorbed, not just the latest round.
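The dispatch logic can be pictured as a table lookup on the registered head type. This is a hedged sketch in the spirit of borg.decode.universal_decode; the decoder functions and return strings here are hypothetical, not the actual Borg API:

```python
# Minimal sketch of head-type dispatch. Each registered capability maps to
# a decoder path; anything unrecognised falls back to a diagnostic report.
HEAD_DECODERS = {
    "causal_lm": lambda out: f"token-decode: {out}",
    "masked_lm": lambda out: f"mask-fill: {out}",
    "embedder": lambda out: f"rag-grounded: {out}",
    "classifier": lambda out: f"class-activations: {out}",
}

def universal_decode(head_type, raw_output):
    decoder = HEAD_DECODERS.get(head_type)
    if decoder is None:  # "unknown" head type: fall back to a report
        return f"cot-report: {raw_output}"
    return decoder(raw_output)

print(universal_decode("causal_lm", "hello"))  # token-decode: hello
print(universal_decode("unknown", "latents"))  # cot-report: latents
```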
Layout
the-borg-project/
docs/
vision.md, architecture.md, plain-english.md
roadmap.md, sprint-plan.md, waterfall.md, status-matrix.md
glossary.md, world-first.md, brain.md, convergence-path.md
neurocrdt-principles.md, collective.md
specs/ per-pillar technical specs
adrs/ architecture decision records
src/borg/
assimilation.py top-level assimilate-and-merge entry
convert.py ANN-to-SNN conversion (feedforward + MBE)
sparse.py sparse-delta envelope with timing trace
merge.py merge orchestration + probe gate + cumulative manifest
e4.py crdt_merge.e4 trust-lattice wiring
fep.py variational free-energy refinement
calibration.py post-merge per-vocab logit rescaling
inference.py event-driven sparse SNN forward pass
decode.py token-level + universal (capability-agnostic) decoder
heads.py capability registry (causal_lm / masked_lm / embedder / classifier / unknown)
bench.py fidelity benchmarks (MBE vs reference)
rag/ document ingest and retrieval
speech.py microphone transcription adapter
app.py Gradio demo with chain-of-thought prompt
worker.py FastAPI worker + Ed25519 envelopes
collective/ P2P gossip layer (CID, Bloom, Peer, sync)
proto/ JSON schemas for contribution envelopes
examples/end-to-end/ local simulation scripts + HF Space config
examples/colab/ Colab notebooks for Nord + continuous absorb
scripts/ hooks, audit, benchmarks, Space wipe
tests/ unit + regression tests
.github/workflows/ CI (lint, type check, tests, audit, benches)
Status
Private, pre-publication. The scaffold is real code backed by tests. Every
closed gap is documented in docs/status-matrix.md. See TASKS.md for the
phased plan, docs/roadmap.md for the delivery calendar, and
docs/sprint-plan.md for the two-week sprint breakdown.
CI runs ruff, mypy, and pytest on Python 3.10, 3.11, and 3.12, plus a dedicated forensic audit and JSON-schema validation.
Install
pip install the-borg-project
# or with optional extras
pip install "the-borg-project[rag,speech,app,convert,worker]"
Quick start (from source for development)
python -m venv .venv
source .venv/bin/activate
pip install -e .[dev]
bash scripts/install-hooks.sh
bash scripts/run-tests.sh
Run the demo
python -m borg.app
Opens a local Gradio UI with chat, document upload (RAG), and microphone
input. The chat dispatches through borg.decode.universal_decode, which
routes per registered head type (causal LM → token decode; masked LM →
mask-fill; classifier → class activations; embedder → RAG-grounded
answer; unknown → three-step CoT report with a latent signature). The
diagnostic panel at the top of the UI shows every absorbed model with
its head type, layer count, parameter count, and per-model sparsity.
Assimilate a model
from borg.assimilation import assimilate
assimilate(
model_ids=["FacebookAI/roberta-base"],
borg_path="borg_snn.pt",
)
Every call merges the new shard into the state at borg_snn.pt without
overwriting prior knowledge. The LM head and tokenizer id from the first
source model are persisted alongside the merged weights so the UI can
decode text immediately after a single assimilation round. The manifest
at borg_snn.pt.manifest.json accumulates every absorbed model
(dedup'd by model_id), and each model's detected head type is
registered under state_dict[HEADS_KEY] so the runtime can dispatch
per-model capabilities.
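The cumulative manifest can be inspected directly. The snippet below is a sketch: the manifest location and dedup-by-model_id behaviour come from the description above, but the exact JSON schema (a list of entries with a "model_id" field) is an assumption:

```python
# Hedged sketch: list every model ever absorbed into a state file by reading
# the sidecar manifest. The schema (list of dicts with "model_id") is assumed.
import json
from pathlib import Path

def absorbed_models(state_path="borg_snn.pt"):
    manifest_path = Path(f"{state_path}.manifest.json")
    entries = json.loads(manifest_path.read_text())
    # Dedup by model_id, mirroring how the manifest accumulates absorbs.
    seen = {entry["model_id"] for entry in entries}
    return sorted(seen)
```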
Absorb a large SNN on Colab
For checkpoints too big for a free GitHub Actions runner (e.g. Nord at 13 GB), absorb on Colab instead: it offers 100+ GB of disk and a persistent session.
The notebook downloads the source repo, mmaps each safetensors shard via
safe_open, drops oversized cross-architecture tensors (LM head,
embeddings) automatically, runs assimilate_state_dict per shard, and
uploads the resulting borg_snn.pt straight to your HF Space.
Join the collective
Every clone is a peer. The collective layer (borg.collective) turns
local absorptions into signed envelopes addressed by content hash, and
synchronises them pairwise with known peers. The HuggingFace Space is
a seed peer in the default list -- not a central master.
from borg.collective import Peer, collective_assimilate
peer = Peer(peer_id="my-node")
result = collective_assimilate(
state_dict=my_state,
model_id="org/model",
local_peer=peer,
)
# result.cids_broadcast holds the Merkle-DAG ids you just published
If model_id has already been absorbed by any peer in the local
roster, collective_assimilate short-circuits -- one million clones
each running collective_assimilate("gpt2") produces one logical
absorption network-wide. See docs/collective.md.
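The deduplication falls out of content addressing: identical envelopes hash to the same id, so every peer independently derives the same CID for the same absorption. The actual borg.collective CID format may differ; this sketch shows only the general idea:

```python
# Illustrative content addressing: identical envelopes hash to the same id,
# so a million peers absorbing "gpt2" publish one logical contribution.
import hashlib
import json

def content_id(envelope: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) makes the hash deterministic
    # regardless of how the peer built the dict.
    canonical = json.dumps(envelope, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

a = content_id({"model_id": "gpt2", "shard": "abc"})
b = content_id({"shard": "abc", "model_id": "gpt2"})
assert a == b  # key order never matters, so dedup is exact
```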
Run the worker
pip install -e .[worker]
uvicorn borg.worker:create_app --factory --host 0.0.0.0 --port 8000
The worker exposes /contribute (accepts Ed25519-signed SparseDelta
envelopes), /gossip (exchanges digests with peer workers), and
/health. Envelope schema is in proto/contribution.schema.json.
Running the worker turns your clone into a reachable peer; the
borg.collective.http_sync client-side gossip then converges against
you alongside every other peer on the network.
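The convergence guarantee behind /gossip can be sketched as a pairwise union of content-id stores. Real workers exchange signed envelopes and Bloom digests over HTTP; this toy version shows only the sync logic:

```python
# Toy sketch of digest gossip: each peer advertises the content ids it
# holds, and the pair exchanges only what the other is missing. After one
# round, both stores hold the union -- repeated pairwise rounds across the
# network converge every peer to the same state with no central master.

def gossip_round(store_a: dict, store_b: dict):
    """One pairwise sync: after it runs, both stores hold the union."""
    missing_in_b = {cid: store_a[cid] for cid in store_a.keys() - store_b.keys()}
    missing_in_a = {cid: store_b[cid] for cid in store_b.keys() - store_a.keys()}
    store_b.update(missing_in_b)
    store_a.update(missing_in_a)

a = {"cid1": "envelope-1"}
b = {"cid2": "envelope-2"}
gossip_round(a, b)
assert a == b == {"cid1": "envelope-1", "cid2": "envelope-2"}
```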
Benchmark the MBE conversion
python scripts/bench_mbe.py --dim 32 --samples 32
Reports Spearman, Pearson, and cosine correlation between a reference
transformer and its MBE-converted SNN on random inputs. See
src/borg/bench.py for the harness.
Upstream
crdt-merge provides the OR-Set, Merkle-provenance, and E4 trust primitives
this repository depends on. It is licensed BUSL-1.1 (change date 2028-03-29,
change license Apache-2.0) with patents held separately (UK Application
Nos. 2607132.4, GB2608127.3). See docs/adrs/0006-license-considerations.md
for the implications.
License
the-borg-project is licensed under the Business Source License 1.1, mirroring the upstream crdt-merge model. The BUSL Additional Use Grant permits all embedded use (libraries, SaaS, research, internal tooling, commercial products) and blocks only the resale of the-borg-project itself as a competing merge / assimilation / collective-gossip service.
On 2028-03-29 the licence automatically converts to Apache License, Version 2.0.
See LICENSE for the full terms, NOTICE for third-party attributions,
and docs/adrs/0006-license-considerations.md for the rationale.
For commercial licensing of out-of-scope uses:
rgillespie83@icloud.com / data@optitransfer.ch.
Contributing
See CONTRIBUTING.md. The branch protocol, identity lock, and pre-push
forensic audit in scripts/pre-push-audit.sh are mandatory for every
contribution.
Project details
Download files
File details
Details for the file the_borg_project-0.1.0.tar.gz.
File metadata
- Download URL: the_borg_project-0.1.0.tar.gz
- Upload date:
- Size: 115.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 0ebe78339f844506015f995d9447c5a9c0560dbfbdb17f15869aa4e510624d2f |
| MD5 | 4c6768417599a0ead96b038e201d5b9b |
| BLAKE2b-256 | 8008cddc6259110396dd5743661b380a9add2dd38bce154fd08abab32e8fe5a5 |
File details
Details for the file the_borg_project-0.1.0-py3-none-any.whl.
File metadata
- Download URL: the_borg_project-0.1.0-py3-none-any.whl
- Upload date:
- Size: 87.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6a2217e5c2c69fc1cd2d3d110aed6e11cbbee7e0dca24704dc4524dbe7f5e447 |
| MD5 | dd03d9ae55b1012758f767686afa47bf |
| BLAKE2b-256 | 91cf434a7df51e7b81c61da80e9afb43e0e4f0496cc1811161dab7d6ea8725ea |