Polymathera's no-RAG, multi-agent framework for extremely long, dense contexts (1B+ tokens).
Colony
A no-RAG, cache-aware multi-agent framework for extremely long, dense contexts (1B+ tokens).
Colony is a framework for building tightly-coupled, self-improving, self-aware multi-agent systems (agent colonies) that reason over extremely long context without retrieval-augmented generation (RAG). Instead of fragmenting context into chunks and retrieving snippets, Colony keeps the entire context live across a cluster of LLMs through a virtual memory system that manages GPU KV caches the same way an operating system manages (almost unlimited) virtual memory over finite physical memory.
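The operating-system analogy can be made concrete with a toy pager. The sketch below is not Colony's VCM. It is a minimal illustration, assuming a fixed number of resident page slots and a least-recently-used eviction rule, of how a small physical cache can serve a much larger virtual context by faulting pages in on demand. The names `ToyPager` and `touch` are hypothetical and do not come from the Colony API.

```python
from collections import OrderedDict


class ToyPager:
    """Hypothetical LRU pager: `capacity` resident slots over an unbounded page space."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.resident = OrderedDict()  # page_id -> None, ordered oldest to newest
        self.faults = 0
        self.evictions = []

    def touch(self, page_id: str) -> bool:
        """Access a page. Returns True on a hit, False on a page fault."""
        if page_id in self.resident:
            self.resident.move_to_end(page_id)  # mark as most recently used
            return True
        self.faults += 1
        if len(self.resident) >= self.capacity:
            victim, _ = self.resident.popitem(last=False)  # evict least recently used
            self.evictions.append(victim)
        self.resident[page_id] = None  # "load" the page into a free slot
        return False


# Two resident slots serving accesses to three distinct pages.
pager = ToyPager(capacity=2)
hits = [pager.touch(p) for p in ["a", "b", "a", "c", "b"]]
print(hits, pager.faults, pager.evictions)
```

A real system would page KV-cache blocks rather than string identifiers, and its eviction policy could weigh far more than recency; the point here is only the hit, fault, and evict cycle.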
| 💡 Colony's Vision |
| Colony's goal is to be the most efficient country of geniuses in a datacenter: the ideal substrate for civilization-building AI. |
| ⚠️ Pre-Alpha Early Access |
| Colony is still in pre-alpha early access. The API is not stable and the framework is under active development. We welcome feedback and contributions, but be aware that breaking changes may occur. |
| ℹ️ Who should use Colony? |
| Colony is designed for engineers building complex multi-agent systems that require reasoning over extremely long contexts. It is not a general-purpose agent framework or a consumer product. If you are looking for a simple agent orchestration tool or a way to add tool use to an LLM, Colony may not be the right fit. It runs on a Ray cluster (local or in the cloud) and can be resource-intensive and expensive. |
Why Colony?
Most agent frameworks treat context as something to retrieve or manage. Colony treats it as something to be brought to life. Certain domains require reasoning that is both deep and wide. Examples include:
- Scientific research: synthesizing novel insights from a vast literature requires complex integration
- Cyber-physical systems: understanding the full context of a complex system (code, physical environment, requirements, regulations) is essential for architecting solutions and identifying edge cases and failure modes
- Systemic vulnerability analysis: identifying security risks in a complex system by reasoning over a large attack surface and many potential interactions
- Business intelligence: making strategic decisions based on a wide range of internal and external data, where relevant information may be siloed and require cross-domain reasoning
- Economic modeling: simulating and understanding complex economic systems with many interacting agents and factors and long supply chains
- Long-form content creation: writing a book or comprehensive report that requires maintaining a coherent narrative across a large amount of information
Colony's core innovations are:
- NoRAG -- Colony keeps the full context live and accessible, not filtered through retrieval. Colony manages all kinds of context (code, text, data) through distributed KV cache paging, not vector search.
- Cache-Aware Agents -- Agents are aware of what's in GPU memory (at the cluster level) and consciously plan their work to maximize cache reuse.
- Agents All the Way Down -- General intelligence emerges from the right composition of agent capabilities and multi-agent patterns. Every cognitive process -- attention, memory, planning, confidence tracking -- is a pluggable policy with a default implementation.
- Distributed Reasoning Patterns -- Multi-agent game protocols (hypothesis games, contract nets, negotiation) combat specific LLM failure modes: hallucination, laziness, and goal drift.
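To illustrate what planning for cache reuse can mean, here is a small sketch under strong simplifying assumptions: each task declares the set of pages it needs, and the cache holds only the pages of the most recently run task. The functions `count_loads` and `plan_order`, the task names, and the page identifiers are all hypothetical; Colony's actual scheduling is not described in this README and may work quite differently.

```python
def count_loads(order: list[str], tasks: dict[str, set[str]], resident: set[str]) -> int:
    """Pages that must be loaded to run `order`, in a toy model where the
    cache holds only the pages of the most recently run task."""
    cache, loads = set(resident), 0
    for name in order:
        loads += len(tasks[name] - cache)  # pages not already in the cache
        cache = set(tasks[name])
    return loads


def plan_order(tasks: dict[str, set[str]], resident: set[str]) -> list[str]:
    """Greedy plan: always run the pending task with the fewest missing pages."""
    pending = dict(tasks)
    cache = set(resident)
    order: list[str] = []
    while pending:
        # Ties break on task name so the plan is deterministic.
        name = min(pending, key=lambda t: (len(pending[t] - cache), t))
        cache = set(pending.pop(name))
        order.append(name)
    return order


# Hypothetical tasks and the pages each one reads.
tasks = {
    "audit_auth": {"p1", "p2"},
    "audit_db": {"p3", "p4"},
    "audit_session": {"p2", "p5"},
    "audit_orm": {"p4", "p6"},
}
resident = {"p1"}  # pages already in the cache

planned = plan_order(tasks, resident)
naive_loads = count_loads(list(tasks), tasks, resident)
planned_loads = count_loads(planned, tasks, resident)
print(planned, naive_loads, planned_loads)
```

In this toy example the greedy plan needs 5 page loads against 7 for the declaration order, because tasks that share pages are run back to back.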
Read the full Philosophy for the ideas behind the framework.
P.S. Colony does not preclude agents from using retrieval or vector search -- those can be implemented as capabilities that agents use when appropriate. Colony's point is that retrieval is not the only way to manage long context, and for certain domains, it's not the best way.
Architecture
┌────────────────────────────────────────────────────────────────────────────────────────────┐
│                                        Agent Colony                                        │
│                                                                                            │
│                  ┌────────────────┐    ┌────────────────┐    ┌────────────────┐            │
│                  │ Agent 1        │    │ Agent 2        │    │ Agent N        │            │
│                  │ Capabilities   │    │ Capabilities   │    │ Capabilities   │  ...       │
│                  │ Action Policy  │    │ Action Policy  │    │ Action Policy  │            │
│                  │ Planner (LLM)  │    │ Planner (LLM)  │    │ Planner (LLM)  │            │
│                  └───────┬────────┘    └───────┬────────┘    └───────┬────────┘            │
│  read/write/query/mmap   │                     │                     │ infer_with_suffix   │
│  ┌───────────────────────┴─────────────────────┴─────────────────────┴───┐ page_graph_ops  │
│  │                                                                       ▼                 │
│  │    ┌────────────────────────────┐     ┌──────────────────────────────────────────────┐  │
│  │    │ Blackboard (Redis)         │     │ Virtual Context Memory (VCM)                 │  │
│  │    │                            │     │                                              │  │
│  ├───►│ Shared state & events      │     │ Page Table · Page Graph                      │  │
│  │    │ OCC · Memory scopes        │     │ Cache Scheduling · Page Faults               │  │
│  │    │ Agent coordination         │     │                                              │  │
│  │    └──────┬─────────────────────┘     │  ┌──────────┐  ┌──────────┐                  │  │
│  │           │                           │  │ LLM N1   │  │ LLM N2   │  ...             │  │
│  │           │ mmap/munmap/invalidate    │  │ KV Cache │  │ KV Cache │                  │  │
│  │           └──────────────────────────►│  └──────────┘  └──────────┘                  │  │
│  │             mmap/munmap/invalidate    │                                              │  │
│  │           ┌──────────────────────────►│                                              │  │
│  │           │                           │ Context Sources (mapped as pages):           │  │
│  │    ┌──────┴─────────────────────┐     │  ┌────────┐  ┌───────────┐  ┌────────────┐   │  │
│  │    │ External Sources           │     │  │ Repos  │  │ Knowledge │  │ Blackboard │   │  │
│  └───►│ Git repos, documents,      │     │  │        │  │ Bases     │  │ Data       │   │  │
│       │ knowledge bases, data      │     │  └────────┘  └───────────┘  └────────────┘   │  │
│       └────────────────────────────┘     └──────────────────────────────────────────────┘  │
└────────────────────────────────────────────────────────────────────────────────────────────┘
Each Agent composes pluggable capabilities (memory, attention, games, confidence tracking, grounding, reflection, cache awareness, etc.) coordinated by an ActionPolicy that consults an LLM Planner. Agents share state through a Redis-backed Blackboard with optimistic concurrency control (OCC) and causal ordering. The Virtual Context Memory (VCM) manages distributed GPU KV caches as pages, enabling agents to reason over contexts far larger than any single model's window.
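Optimistic concurrency control is easiest to see in miniature. The sketch below is an in-memory stand-in, not Colony's Redis-backed Blackboard: every key carries a version number, a write succeeds only if the writer saw the latest version, and a losing writer re-reads and retries. `ToyBlackboard`, `VersionConflict`, and `update` are hypothetical names chosen for this illustration.

```python
class VersionConflict(Exception):
    """Raised when a write is based on a stale version of a key."""


class ToyBlackboard:
    """Hypothetical in-memory store with per-key version numbers."""

    def __init__(self) -> None:
        self._data = {}  # key -> (version, value)

    def read(self, key):
        """Return (version, value); unknown keys are at version 0."""
        return self._data.get(key, (0, None))

    def write(self, key, value, expected_version: int) -> int:
        """Commit only if nobody else has written since `expected_version`."""
        current_version, _ = self.read(key)
        if current_version != expected_version:
            raise VersionConflict(f"{key}: expected v{expected_version}, found v{current_version}")
        self._data[key] = (current_version + 1, value)
        return current_version + 1


def update(board: ToyBlackboard, key, fn, max_retries: int = 5) -> int:
    """Read-modify-write loop: on conflict, re-read and try again."""
    for _ in range(max_retries):
        version, value = board.read(key)
        try:
            return board.write(key, fn(value), version)
        except VersionConflict:
            continue
    raise RuntimeError(f"gave up on {key!r} after {max_retries} conflicts")


board = ToyBlackboard()
v1 = board.write("findings", ["a"], expected_version=0)

# A second agent still holding version 0 is rejected instead of overwriting.
try:
    board.write("findings", ["stale"], expected_version=0)
    stale_rejected = False
except VersionConflict:
    stale_rejected = True

v2 = update(board, "findings", lambda xs: xs + ["b"])
print(v1, stale_rejected, v2, board.read("findings"))
```

No locks are held while an agent computes its update, which is why the pattern suits workloads where conflicts are expected to be rare.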
See the full Architecture docs.
Quick Start
Installation
pip install polymathera-colony
With optional extras:
pip install "polymathera-colony[code_analysis]" # Code analysis tools
pip install "polymathera-colony[gpu]" # GPU inference (vLLM, PyTorch)
pip install "polymathera-colony[cpu]" # CPU-only inference (Anthropic API)
pip install "polymathera-colony[code_analysis,gpu,cpu]" # Everything
Local Test Environment
Colony ships with colony-env, a CLI tool that spins up a local Ray cluster + Redis using Docker Compose. The only prerequisite is Docker.
# Start the cluster (builds image on first run)
colony-env up
# Generate a sample analysis config
polymath init-config --output my_analysis.yaml
# Run a code analysis over a local codebase
colony-env run /path/to/codebase --config my_analysis.yaml
# Check service status
colony-env status
# Open the web dashboard
colony-env dashboard
# Scale workers
colony-env up --workers 3
# Tear down
colony-env down
# Verify prerequisites
colony-env doctor
All Colony dependencies run inside Docker -- no local GPU drivers, Ray, or Redis installation is required. The colony-env run command copies the codebase to be analyzed into the cluster and executes inside the Ray head container with full access to the framework.
Services started by colony-env up:
| Service | Port | Description |
|---|---|---|
| Colony dashboard | localhost:8080 | Web UI for agents, sessions, VCM |
| Ray dashboard | localhost:8265 | Cluster monitoring UI |
| Ray client | localhost:10001 | Ray client connection |
| Redis | localhost:6379 | State management backend |
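If a service seems unreachable, a quick way to see which of these ports are accepting connections is a plain TCP probe. The script below is a convenience sketch using only the Python standard library; it is not part of colony-env, and the helper names `is_listening` and `check_services` are made up for this example. The port numbers are the defaults from the table above.

```python
import socket

# Default ports from the services table.
SERVICES = {
    "Colony dashboard": 8080,
    "Ray dashboard": 8265,
    "Ray client": 10001,
    "Redis": 6379,
}


def is_listening(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, or unreachable
        return False


def check_services(host: str = "localhost") -> dict[str, bool]:
    """Probe every default service port and report which ones answer."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}


if __name__ == "__main__":
    for name, up in check_services().items():
        print(f"{name:<18} {'up' if up else 'down'}")
```

An open port only shows that something is listening there; it does not confirm that the listener is the expected Colony service or that it is healthy.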
Web Dashboard
The Colony dashboard starts automatically with colony-env up at localhost:8080. It provides:
- Overview -- cluster health, application deployments, quick stats
- Agents -- list registered agents, view state, capabilities, and details
- Sessions -- browse sessions and their agent runs with token usage
- VCM -- page table, working set, and virtual context statistics
- Traces -- detailed tracing of agent actions, VCM operations, and system events for debugging and performance analysis
# Run the agent colony
colony-env down && colony-env up --workers 3 && colony-env run --local-repo /path/to/codebase --config my_analysis.yaml --verbose
# Open the dashboard in your browser
colony-env dashboard
# Use a custom port (must match COLONY_DASHBOARD_UI_PORT)
colony-env dashboard --port 9090
For frontend development, run the Vite dev server on the host with hot-reload:
cd src/polymathera/colony/web_ui/frontend
npm install
npm run dev # Starts on localhost:5173, proxies /api to localhost:8080
Key Features
| Feature | Description | Docs |
|---|---|---|
| Virtual Context Memory | OS-style virtual memory for LLM KV caches with page tables and cache-aware scheduling | VCM |
| Agent Capabilities | Composable cognitive modules (memory, attention, games, confidence) attached to agents via AOP-inspired patterns | Agent System |
| Action Policies | LLM-centric planning with Model Predictive Control -- the LLM is the planner, not the framework | Action Policies |
| Blackboard | Redis-backed shared state with optimistic concurrency, causal timelines, and event-driven coordination | Blackboard |
| Memory Hierarchies | Unified memory system with sensory, working, short-term, and long-term memory -- all backed by blackboards | Memory |
| Game Engine | Hypothesis games, contract nets, negotiation, and consensus protocols for multi-agent coordination | Games |
| Hook System | AOP-inspired hooks for cross-cutting concerns (logging, tracing, metrics, memory triggers) | Hooks |
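The AOP-inspired idea behind the hook system can be shown in a few lines: cross-cutting behavior such as tracing is registered separately and runs around an action without the action knowing about it. The sketch below is a generic illustration of that pattern, not Colony's hook API; `HookRegistry`, its `before`/`after` decorators, and the `summarize` action are hypothetical.

```python
from collections import defaultdict


class HookRegistry:
    """Hypothetical registry that runs hooks before and after named actions."""

    def __init__(self) -> None:
        self._before = defaultdict(list)
        self._after = defaultdict(list)

    def before(self, action: str):
        """Decorator: register a hook called as hook(action, args, kwargs)."""
        def register(fn):
            self._before[action].append(fn)
            return fn
        return register

    def after(self, action: str):
        """Decorator: register a hook called as hook(action, result)."""
        def register(fn):
            self._after[action].append(fn)
            return fn
        return register

    def run(self, action: str, fn, *args, **kwargs):
        """Invoke fn, wrapped by whatever hooks are registered for `action`."""
        for hook in self._before[action]:
            hook(action, args, kwargs)
        result = fn(*args, **kwargs)
        for hook in self._after[action]:
            hook(action, result)
        return result


hooks = HookRegistry()
trace = []


@hooks.before("summarize")
def log_call(action, args, kwargs):
    trace.append(f"before {action}")


@hooks.after("summarize")
def count_words(action, result):
    trace.append(f"after {action}: {len(result.split())} words")


def summarize(text: str) -> str:
    """Stand-in action: keep the first three words."""
    return " ".join(text.split()[:3])


result = hooks.run("summarize", summarize, "pages are mapped into the cache")
print(result, trace)
```

The action itself stays free of logging or metrics code, which is the property that makes concerns such as tracing and memory triggers pluggable.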
Development
git clone https://github.com/polymathera/colony.git
cd colony
poetry install --all-extras
Running Tests
pytest src/ --timeout=120 -x -q
Documentation
poetry run mkdocs serve --livereload # Local docs server at http://127.0.0.1:8000/
poetry run mkdocs build # Build static site
Contributing
We welcome contributions. See CONTRIBUTING.md for development setup, code conventions, and the PR process.
License
Apache 2.0 -- see LICENSE.
File details
Details for the file polymathera_colony-0.1.2.tar.gz.
File metadata
- Download URL: polymathera_colony-0.1.2.tar.gz
- Upload date:
- Size: 1.2 MB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 9fcfb4cc6088758955242f9407b1454d4ca8732860f0be6132f3b53bfc4640a5 |
| MD5 | cb11b588e212a41c1363e522768e2fba |
| BLAKE2b-256 | be213f671d8c28920246de7a46c8768acb4598ef48ed0a257fa8aa232fdf1b44 |
Provenance
The following attestation bundles were made for polymathera_colony-0.1.2.tar.gz:
Publisher: publish.yml on Polymathera/colony
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: polymathera_colony-0.1.2.tar.gz
- Subject digest: 9fcfb4cc6088758955242f9407b1454d4ca8732860f0be6132f3b53bfc4640a5
- Sigstore transparency entry: 1109812319
- Sigstore integration time:
- Permalink: Polymathera/colony@899af031ea3a1769deab0562964a2b2d09248657
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Polymathera
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@899af031ea3a1769deab0562964a2b2d09248657
- Trigger Event: push
File details
Details for the file polymathera_colony-0.1.2-py3-none-any.whl.
File metadata
- Download URL: polymathera_colony-0.1.2-py3-none-any.whl
- Upload date:
- Size: 1.5 MB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 92ed11a7bf5a558e0585901a718227cae27c51d9a26e4e9bf6336885ac1690f1 |
| MD5 | b447d40f8e84694ad5fd9c0d705c0467 |
| BLAKE2b-256 | 909a56648143263f94551482d87ee5c232ca519651b0dac8bb3cb9035cf51f30 |
Provenance
The following attestation bundles were made for polymathera_colony-0.1.2-py3-none-any.whl:
Publisher: publish.yml on Polymathera/colony
Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: polymathera_colony-0.1.2-py3-none-any.whl
- Subject digest: 92ed11a7bf5a558e0585901a718227cae27c51d9a26e4e9bf6336885ac1690f1
- Sigstore transparency entry: 1109812325
- Sigstore integration time:
- Permalink: Polymathera/colony@899af031ea3a1769deab0562964a2b2d09248657
- Branch / Tag: refs/tags/v0.1.2
- Owner: https://github.com/Polymathera
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@899af031ea3a1769deab0562964a2b2d09248657
- Trigger Event: push