Resonance-Topological Memory for Large Language Models
RTMDK — Resonance-Topological Memory v8.3
Long-term memory for LLMs based on resonance topology and dialectical consolidation.
Version 8.3 (Pipeline Architecture + HNSW + Observability + Production Hardening): 74,000+ lines of code, 440+ files, 44 API endpoints, 1112 tests
Production Stats
| Metric | Value |
|---|---|
| Recall@1 (vs Cosine) | 0.993 vs 0.181 |
| Latency p50 @ 1K nodes | 0.26 ms |
| Latency p50 @ 100K nodes | 16 ms |
| Latency p99 @ 100K nodes | 20 ms |
| Tests | 1112 passed, 2 skipped |
| Pipeline stages | 6 (explicit, observable) |
| Circuit breakers | Per-stage |
| Streaming protocols | SSE, WebSocket, GraphQL |
🚀 Quick start
Option A: Python (recommended for development)
pip install -r requirements-home.txt
python rtmdk_server.py
# → http://localhost:8080
Option B: Docker Production
docker-compose -f docker-compose.prod.yml up -d
curl http://localhost:8080/health
Option C: Docker Home + SillyTavern
docker-compose -f docker-compose.home.yml up -d
# Server: http://localhost:8080
# SillyTavern Proxy: http://localhost:5000
Option D: SillyTavern Launcher
python rtmdk_sillytavern_launcher.py
# Starts the server (8080) and the proxy (5000) automatically
🔄 Pipeline API (v8.3+)
RTMDK now provides an explicit retrieval pipeline with 6 stages, each independently observable and configurable:
from rtmdk import RTMDKMemory, RTMDKConfig
config = RTMDKConfig.production()
# embed_fn is your own embedding callable, defined elsewhere
mem = RTMDKMemory(config=config, embedder=embed_fn)
# Pipeline retrieval with full observability
result = mem.retrieve_nodes_pipeline("What is resonance?", top_k=5)
# result["results"] — ranked nodes
# result["route"] — routing decision (factual/standard/deep)
# result["metrics"] — per-stage latency + breaker states
Pipeline stages
- Embed — query → embedding
- Route — adaptive cascade routing
- Retrieve — resonance / HNSW / BM25 hybrid
- Rerank — sentence-level reranking
- Calibrate — conformal prediction filtering
- Explain — per-result explanations
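To make the staged design concrete, here is a minimal sketch of six named stages run in order with per-stage latency recorded. It is not RTMDK's implementation: `run_pipeline`, the toy stage functions, and the 0.15 score cutoff are hypothetical stand-ins, and only the stage names and the `results` / `route` / `metrics` keys come from the README.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical stage type: takes the shared state dict and returns an updated copy.
Stage = Callable[[Dict[str, Any]], Dict[str, Any]]


def run_pipeline(stages: List[Tuple[str, Stage]], query: str, top_k: int = 5) -> Dict[str, Any]:
    """Run named stages in order and record per-stage latency in milliseconds."""
    state: Dict[str, Any] = {"query": query, "top_k": top_k}
    metrics: Dict[str, float] = {}
    for name, stage in stages:
        start = time.perf_counter()
        state = stage(state)
        metrics[name] = (time.perf_counter() - start) * 1000.0
    return {"results": state.get("results", []), "route": state.get("route"), "metrics": metrics}


# Toy stand-ins for the six stages; real stages would call an embedder, an index, and so on.
def embed(s):
    return {**s, "embedding": [float(len(s["query"]))]}


def route(s):
    return {**s, "route": "factual" if len(s["query"].split()) <= 4 else "standard"}


def retrieve(s):
    return {**s, "results": [{"id": i, "score": 1.0 / (i + 1)} for i in range(10)]}


def rerank(s):
    return {**s, "results": sorted(s["results"], key=lambda r: -r["score"])}


def calibrate(s):
    # Drop low-score candidates (arbitrary cutoff), then keep at most top_k.
    kept = [r for r in s["results"] if r["score"] >= 0.15]
    return {**s, "results": kept[: s["top_k"]]}


def explain(s):
    return {**s, "results": [{**r, "why": f"score={r['score']:.2f}"} for r in s["results"]]}


STAGES = [("embed", embed), ("route", route), ("retrieve", retrieve),
          ("rerank", rerank), ("calibrate", calibrate), ("explain", explain)]

result = run_pipeline(STAGES, "What is resonance?", top_k=5)
```

Keeping each stage as a plain function over a shared state dict is one simple way to make stages individually measurable and swappable.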
Circuit breaker & SLO
Each stage has a circuit breaker. When a stage exceeds its latency threshold or raises errors, it is bypassed automatically:
from rtmdk import RTMDKConfig
config = RTMDKConfig(
    pipeline_breaker_enabled=True,
    pipeline_breaker_thresholds={"rerank": 500.0, "retrieve": 200.0},
)
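The bypass behaviour can be illustrated with a small, self-contained breaker. This is a sketch under assumptions, not RTMDK's code: the `StageBreaker` class, the `max_failures` parameter, and the reading of thresholds as milliseconds are all hypothetical, since the README does not state the units or the trip policy.

```python
import time
from typing import Any, Callable


class StageBreaker:
    """Bypass a stage after repeated slow or failing calls (hypothetical policy)."""

    def __init__(self, threshold_ms: float, max_failures: int = 3):
        self.threshold_ms = threshold_ms  # assumed to be milliseconds
        self.max_failures = max_failures
        self.failures = 0

    @property
    def open(self) -> bool:
        return self.failures >= self.max_failures

    def call(self, stage: Callable[[Any], Any], value: Any) -> Any:
        if self.open:
            return value  # breaker open: skip the stage and pass the input through
        start = time.perf_counter()
        try:
            out = stage(value)
        except Exception:
            self.failures += 1
            return value  # on error, fall back to the unmodified input
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if elapsed_ms > self.threshold_ms:
            self.failures += 1  # too slow: count it, but still use the output
        else:
            self.failures = 0  # a healthy call resets the counter
        return out


def flaky_rerank(results):
    raise RuntimeError("reranker unavailable")


breaker = StageBreaker(threshold_ms=500.0)
candidates = ["a", "b", "c"]
for _ in range(4):
    out = breaker.call(flaky_rerank, candidates)
```

After three consecutive failures the breaker opens and the fourth call returns the candidates untouched. The sketch has no half-open recovery state, which a production breaker would normally add.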
Batch execution
from rtmdk.pipeline import BatchPipelineExecutor
batch = BatchPipelineExecutor(mem.build_pipeline().stages)
outputs = batch.run_batch(["q1", "q2", "q3"], top_k=5)
A/B Testing
Compare pipeline vs legacy before enabling in production:
from rtmdk.pipeline import PipelineABTester
tester = PipelineABTester(mem)
tester.compare_batch(["q1", "q2", "q3"], top_k=5)
Or run: python scripts/bench_pipeline_ab.py --queries 100 --nodes 500
HTTP endpoints
# Synchronous query
curl -X POST http://localhost:8080/v1/memory/query_pipeline \
-H "Content-Type: application/json" \
-d '{"query": "resonance", "top_k": 5, "session_id": "sess_1"}'
# SSE streaming — live stage events
curl -N 'http://localhost:8080/v1/memory/pipeline/stream?query=resonance&top_k=5'
# Health check
curl http://localhost:8080/v1/memory/pipeline/health
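The synchronous query above can also be issued from Python with only the standard library. The endpoint path, payload fields, and port are taken from the curl example; the helper names `build_query_request` and `query_pipeline` are hypothetical, and the assumption that the server replies with a JSON body is not confirmed by the README.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default address from the quick start


def build_query_request(query: str, top_k: int = 5, session_id: str = "sess_1") -> urllib.request.Request:
    """Build the POST request for /v1/memory/query_pipeline without sending it."""
    body = json.dumps({"query": query, "top_k": top_k, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/memory/query_pipeline",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_pipeline(query: str, top_k: int = 5, timeout: float = 10.0) -> dict:
    """Send the request; requires a running server. Assumes a JSON response body."""
    with urllib.request.urlopen(build_query_request(query, top_k), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_query_request("resonance")
```

Separating request construction from sending keeps the payload easy to inspect and unit-test without a live server.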
Async execution
# Non-blocking pipeline for FastAPI / asyncio apps
result = await mem.retrieve_nodes_pipeline_async("query", top_k=5)
# Batch async
results = await mem.build_pipeline().run_batch_async(["q1", "q2", "q3"], top_k=5)
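For readers new to asyncio, the general pattern behind a non-blocking wrapper can be sketched as follows. This does not show how RTMDK implements its async methods: `retrieve_sync` is a hypothetical stand-in for any blocking retrieval call, and `asyncio.to_thread` requires Python 3.9 or newer.

```python
import asyncio
import time
from typing import List


def retrieve_sync(query: str, top_k: int = 5) -> dict:
    """Stand-in for a blocking retrieval call."""
    time.sleep(0.01)
    return {"query": query, "results": list(range(top_k))}


async def retrieve_async(query: str, top_k: int = 5) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(retrieve_sync, query, top_k)


async def retrieve_batch(queries: List[str], top_k: int = 5) -> List[dict]:
    # gather runs the calls concurrently and returns results in input order.
    return await asyncio.gather(*(retrieve_async(q, top_k) for q in queries))


outputs = asyncio.run(retrieve_batch(["q1", "q2", "q3"], top_k=5))
```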
WebSocket streaming
const ws = new WebSocket("ws://localhost:8080/ws/memory");
// Send only once the connection is open; calling send() while connecting throws.
ws.onopen = () => {
  ws.send(JSON.stringify({
    action: "query_pipeline",
    query: "resonance",
    top_k: 5,
    stream: true // live stage events
  }));
};
ws.onmessage = (e) => console.log(JSON.parse(e.data));
📚 Documentation
| What you need | Document |
|---|---|
| Master index | docs/MASTER_INDEX.md |
| API reference | docs/01_API_REFERENCE.md |
| Running on your own PC | docs/03_LOCAL_SETUP.md |
| Docker + SillyTavern | docs/04_DOCKER_SETUP.md |
| Parameter tuning | docs/05_FINE_TUNING.md |
| Production with 100K+ nodes | docs/02_PRODUCTION_GUIDE.md |
| Scientific article (patent) | docs/06_SCIENTIFIC_ARTICLE.md |
| System architecture | docs/08_ARCHITECTURE.md |
| Domain Memory (Phase 20) | docs/20_DOMAIN_MEMORY.md |
| Quick start + SillyTavern | docs/QUICKSTART.md |
| SillyTavern Connection | SILLYTAVERN_CONNECTION_GUIDE.md |
| Parameter calibration | Values.md |
| Code review (audit) | docs/CODE_REVIEW.md |
| Full module audit | docs/FULL_AUDIT.md |
| Commercial roadmap | docs/ROADMAP.md |
| Deployment options | docs/DEPLOYMENT.md |
⚙️ Configuration via presets
RTMDK uses a single source of configuration with 8 ready-made presets:
from rtmdk import RTMDKConfig
config = RTMDKConfig.local() # Personal assistant (~16MB)
config = RTMDKConfig.production() # Production server (~50MB)
config = RTMDKConfig.research() # Maximum accuracy (~200MB)
config = RTMDKConfig.enterprise() # 100K+ nodes, distributed
config = RTMDKConfig.agent() # Autonomous agent
config = RTMDKConfig.legal() # Legal domain (Z3 prover)
config = RTMDKConfig.medical() # Medical domain (Z3 + trust)
config = RTMDKConfig.streaming() # High-throughput (~3ms)
Overriding via environment variables
# Select a preset
RTMDK_PRESET=production python rtmdk_server.py
# Override individual parameters
RTMDK_LATENT_DIM=128 RTMDK_TOP_K=10 python rtmdk_server.py
# Combine both
RTMDK_PRESET=research RTMDK_DECAY_RATE=0.9995 python rtmdk_server.py
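The precedence shown above (preset first, then per-variable overrides) can be sketched as a small resolver. This is an illustration only: `resolve_config` and the values in `PRESETS` are hypothetical and are not RTMDK's real defaults; only the `RTMDK_PRESET`, `RTMDK_LATENT_DIM`, `RTMDK_TOP_K`, and `RTMDK_DECAY_RATE` variable names come from the README.

```python
import os
from typing import Any, Dict, Mapping

# Hypothetical preset values, for illustration only; not RTMDK's real defaults.
PRESETS: Dict[str, Dict[str, Any]] = {
    "local": {"latent_dim": 64, "top_k": 5, "decay_rate": 0.999},
    "production": {"latent_dim": 256, "top_k": 5, "decay_rate": 0.999},
}


def resolve_config(env: Mapping[str, str] = os.environ, default_preset: str = "local") -> Dict[str, Any]:
    """Start from the selected preset, then apply RTMDK_<PARAM> overrides."""
    preset = env.get("RTMDK_PRESET", default_preset)
    config = dict(PRESETS[preset])
    for key, current in config.items():
        raw = env.get(f"RTMDK_{key.upper()}")
        if raw is not None:
            config[key] = type(current)(raw)  # cast using the preset value's type
    config["preset"] = preset
    return config


cfg = resolve_config({"RTMDK_PRESET": "production", "RTMDK_TOP_K": "10"})
```

Passing the environment as a mapping keeps the resolver testable without touching the real process environment.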
🔌 SillyTavern connection
| Mode | API Type | Base URL | API Key |
|---|---|---|---|
| Proxy (recommended) | OpenAI | http://127.0.0.1:5000/v1 | any |
| Monolith | OpenAI | http://127.0.0.1:8080/v1 | rtmdk-local |
| Monolith (Text Completion) | Text Completion | http://127.0.0.1:8080 | - |
Details: SILLYTAVERN_CONNECTION_GUIDE.md
📊 Results
| Metric | Value | vs RAG |
|---|---|---|
| Recall@1 | 99.3% | +20-40% |
| Recall@5 | 99.8% | +15-30% |
| Latency p50 @ 1K | 0.26 ms | 100-500× faster |
| Latency p50 @ 100K | 16 ms | 10-50× faster |
| Latency p99 @ 100K | 20 ms | Stable |
| RAM (1K nodes) | 14 MB | 3-12× less memory |
| RAM (10K fp16) | 9.8 MB | 5-20× less memory |
| Stress test | ✅ 100K nodes, 50 queries | All thresholds passed |
| Batch ingestion | ✅ 1M nodes in 12s (83K/sec) | WAL async, no HNSW |
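For reference, Recall@k as reported in the table is conventionally computed as the fraction of queries whose relevant item appears among the top k results. The sketch below assumes exactly one relevant node per query; the README does not describe the benchmark protocol, so the function name and the toy data are hypothetical.

```python
from typing import Dict, List


def recall_at_k(ranked: Dict[str, List[str]], relevant: Dict[str, str], k: int) -> float:
    """Fraction of queries whose single relevant id appears in the top k results."""
    hits = sum(1 for q, gold in relevant.items() if gold in ranked.get(q, [])[:k])
    return hits / len(relevant)


# Toy rankings: four queries, each with one gold node id.
ranked = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"], "q3": ["m", "n", "o"], "q4": ["p", "q", "r"]}
relevant = {"q1": "a", "q2": "y", "q3": "o", "q4": "zzz"}

r1 = recall_at_k(ranked, relevant, 1)  # only q1 has its gold id at rank 1
r3 = recall_at_k(ranked, relevant, 3)  # q1, q2, and q3 have it within the top 3
```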
🏗️ Architecture
RTMDK v8.3 (74,000+ lines, 440+ files, 44 API endpoints)
├── Core (decoupled v8.3-alpha): RTMDKField + RTMDKMemory facades delegate to 21 subsystems
│ ├── Initializers: FieldInitializer, ContextManager, MemoryPostInitializer, BacklogModulesInitializer, PipelineBuilder
│ ├── Managers: NodeManager, QueryManager, TopologyManager, AsyncPipelineManager, CrystallizationManager, MergeManager, RoutingManager, IndexManager, ProjectionManager, ConsolidationManager, CognitiveManager, OperationalManager, Scheduler, EngramManager
│ └── Engines: ResonanceEngine, CausalInferenceEngine, MetaAdaptiveKernel, TopologyHealer
├── Production: Version Control, Attention Tokens (Phase 15)
├── Safety: Symbolic Overlay, UMP, Safety Certifier (Phase 16)
├── Scale: Role Sharding, Swarm Memory (Phase 17)
├── Tracks (v8.3):
│ ├── Track 1: fp16 Quantization (2× RAM savings)
│ ├── Track 2: Tiered Storage — Hot/Warm/Cold tiers
│ ├── Track 3: Query Cache + Adaptive top_k
│ ├── Track 4: Async Batch Ingestion Pipeline
│ ├── Track 5: WAL Replay & Durability Recovery
│ ├── Track 6: Async Save Worker + Background Index
│ ├── Track 7: CI/CD + PyPI Production Hardening
│ ├── Track 8: MCP Server (Model Context Protocol)
│ ├── Track 9: LangChain LCEL Integration
│ ├── Track 10: Analytics Dashboard API
│ ├── Track 11: API Keys + Tenant Rate Limiting
│ ├── Track 12: Memory Node CRUD REST API
│ ├── Track 13: Structured JSON Request Logging
│ ├── Track 14: Python Client SDK
│ ├── Track 15: Webhook Subscriptions
│ ├── Track 16: Batch Ingestion + Import/Export REST
│ ├── Track 17: Tier 1 Production Readiness (Health, Audit Log, Retention)
│ ├── Track 18: int8 Quantization
│ ├── Track 19: Redis Cache Layer
│ ├── Track 20: gRPC Service
│ ├── Track 21: Encryption at Rest
│ ├── Track 22: OpenTelemetry Tracing
│ ├── Track 23: Load Tests + Docker Compose
│ ├── Track 24: SOT Out-of-the-Box (Self-Organizing Tokenizer)
│ ├── Track 25: GraphQL API (Strawberry)
│ ├── Track 26: WebSocket Streaming (/ws/memory)
│ ├── Track 27: React Admin Panel
│ └── Track 28: SOT Persistence + Graceful Degradation
└── Integrations: OpenAI, Anthropic, LM Studio, SillyTavern, MCP, LangChain, LlamaIndex
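Track 1 in the tree above lists fp16 quantization with 2× RAM savings. The arithmetic behind that figure is simply 2 bytes per component instead of 4, which the standard library can demonstrate directly. This is a sketch of the storage idea only; the helper names are hypothetical and RTMDK's actual storage format is not described in the README.

```python
import struct
from typing import List


def pack_fp32(vec: List[float]) -> bytes:
    """Serialize as little-endian 32-bit floats (4 bytes per component)."""
    return struct.pack(f"<{len(vec)}f", *vec)


def pack_fp16(vec: List[float]) -> bytes:
    """Serialize as little-endian IEEE 754 half floats (2 bytes per component)."""
    return struct.pack(f"<{len(vec)}e", *vec)


def unpack_fp16(buf: bytes) -> List[float]:
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


# These values are exactly representable in half precision, so the round trip is lossless.
vec = [0.125, -0.5, 0.75, 1.0]
full, half = pack_fp32(vec), pack_fp16(vec)
restored = unpack_fp16(half)
```

Values that are not exactly representable in half precision are rounded on packing, so retrieval quality has to be checked empirically after quantizing.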
📦 Supported API providers
| Provider | Variable |
|---|---|
| LM Studio (local, free) | RTMDK_API_PROVIDER=lm_studio |
| OpenRouter (unified) | RTMDK_API_PROVIDER=openrouter |
| OpenAI (official) | RTMDK_API_PROVIDER=openai |
| Anthropic (official) | RTMDK_API_PROVIDER=anthropic |
| Custom (Groq, Together, LocalAI) | RTMDK_API_PROVIDER=custom |
📁 Project structure
.
├── rtmdk/ # Python package
│ ├── __init__.py # Re-exports all symbols
│ ├── config.py # RTMDKConfig + 8 presets
│ ├── nodes.py # Data classes (MemoryNode, etc.)
│ ├── engrams.py # Phase 18: Engram system
│ ├── memory/
│ │ ├── core.py # RTMDKMemory + core (~2600 lines)
│ │ ├── field.py # RTMDKField: query, consolidation, cache (~5200 lines)
│ │ ├── resonance.py # ResonanceEngine: pure resonance math
│ │ ├── config.py # RTMDKConfig + 8 presets
│ │ ├── tiered_storage.py # Track 2: Hot/Warm/Cold tiers
│ │ ├── query_cache.py # Track 3: Query Cache
│ │ ├── wal.py # Track 5: Write-Ahead Log
│ │ └── serialization.py # Import/Export
│ ├── server/
│ │ └── app.py # FastAPI production server
│ ├── engines/ # Computation engines (9 modules)
│ ├── support/ # 28 support classes
│ └── production/ # 33 production modules
├── docs/ # Documentation (15 files)
├── tests/ # Tests
├── archive/ # Historical files
├── rtmdk_server.py # Monolith server (with ST endpoints)
├── rtmdk_server_ux.py # UX endpoints router
├── rtmdk_dashboard_ui.py # Dashboard UI endpoints
├── rtmdk_sillytavern_launcher.py # SillyTavern launcher
├── rtmdk_st_proxy.py # SillyTavern proxy
├── embedder_lmstudio.py # LM Studio embedder
├── archive/scripts/generate_qa_1000.py # QA dataset generator
├── tests/smoke_test.py # Smoke tests
├── Dockerfile / Dockerfile.home / Dockerfile.gpu
├── docker-compose.yml / docker-compose.prod.yml / docker-compose.home.yml
└── requirements*.txt
🎯 Implementation phases
| Phase | What was implemented | Status |
|---|---|---|
| 1-14 | RTMDK core: resonance, consolidation, HNSW, BM25, PCA | ✅ |
| 15 | Version Control, Proactive Clarification, Attention Tokens | ✅ |
| 16 | Symbolic Overlay, Safety Certifier, UMP | ✅ |
| 17 | Role Sharding, Swarm Memory | ✅ |
| 18 | Engrams: co-activation patterns, pattern completion | ✅ |
| 19 | Offline Dreaming, Causal Traversal, SSM/Mamba, Trust Consensus, Neuro-Symbolic Prover | ✅ |
| 20 | Domain Memory: Domain Hierarchy, Concept Lifecycle, Evidence Spans, Bi-temporal Facts | ✅ |
🚀 Tracks v8.3 (Production Hardening)
| Track | Feature | Status |
|---|---|---|
| 1 | fp16 Quantization: 2× less RAM, 100% R@1 | ✅ Shipped |
| 2 | Tiered Storage: Hot/Warm/Cold tiers, LFU, msgpack | ✅ Shipped |
| 3 | Query Cache: MD5 key, TTL, adaptive top_k | ✅ Shipped |
| 4 | Async Batch Ingestion: vectorized pipeline | ✅ Shipped |
| 5 | WAL Replay: durability, crash recovery | ✅ Shipped |
| 6 | Async Save Worker: background index build | ✅ Shipped |
| 7 | CI/CD + PyPI: automatic publishing | ✅ Shipped |
| 8 | MCP Server: Model Context Protocol | ✅ Shipped |
| 9 | LangChain LCEL: native integration | ✅ Shipped |
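Track 3 describes a query cache keyed by MD5 with a TTL. A minimal sketch of that idea follows. The `QueryCache` class, its method names, and the choice to hash `top_k` together with the query text are assumptions for illustration; the README does not say which fields go into the key or how expiry is handled.

```python
import hashlib
import time
from typing import Any, Callable, Dict, Optional, Tuple


class QueryCache:
    """TTL cache keyed by an MD5 fingerprint of the query (hypothetical design)."""

    def __init__(self, ttl_seconds: float = 60.0, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    @staticmethod
    def key(query: str, top_k: int) -> str:
        # MD5 serves as a fast fingerprint here, not as a security measure.
        return hashlib.md5(f"{top_k}:{query}".encode("utf-8")).hexdigest()

    def get(self, query: str, top_k: int) -> Optional[Any]:
        k = self.key(query, top_k)
        entry = self._store.get(k)
        if entry is None:
            return None
        stored_at, value = entry
        if self.clock() - stored_at > self.ttl:
            del self._store[k]  # expired: evict and report a miss
            return None
        return value

    def put(self, query: str, top_k: int, value: Any) -> None:
        self._store[self.key(query, top_k)] = (self.clock(), value)


# A fake clock makes the expiry behaviour deterministic.
now = [0.0]
cache = QueryCache(ttl_seconds=30.0, clock=lambda: now[0])
cache.put("resonance", 5, ["node_1"])
fresh = cache.get("resonance", 5)
now[0] = 31.0
expired = cache.get("resonance", 5)
```

Including `top_k` in the key avoids serving a 5-result answer to a request that asked for 10.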
RTMDK v8.3: outperforms GraphRAG, Self-RAG, and Advanced RAG in accuracy, latency, and TCO.
Documentation: docs/MASTER_INDEX.md