A locally-first, multi-agent autonomous task execution framework with zero cloud dependencies.
Project description
SumoSpace
A locally-first, multi-agent autonomous task execution framework with zero cloud dependencies.
The 60-second pitch
SumoSpace is a locally-first autonomous task execution framework designed for deterministic control over complex workflows. It provides a multi-agent deliberation pipeline (Planner → Critic → Resolver) that safely interacts with your local filesystem and tools, with full memory and RAG capabilities. Built with privacy and speed in mind, it operates entirely on your local machine with zero cloud dependencies required.
pip install sumospace
from sumospace import SumoKernel, SumoSettings
import asyncio
async def main():
async with SumoKernel(SumoSettings(provider="ollama", model="phi3:mini")) as kernel:
trace = await kernel.run(
"Find all functions in ./src that have no docstrings and add them"
)
print(trace.final_answer)
asyncio.run(main())
Why SumoSpace
| Capability | SumoSpace | LangChain | LlamaIndex | AutoGPT |
|---|---|---|---|---|
| Local Inference | First-class | Bolted-on | Bolted-on | Secondary |
| Multi-user Isolation | Native (Scope) | Manual | Manual | None |
| Planning Safety | Committee | None | None | Prompt-based |
| Cloud Required | No | Optional | Optional | No |
| Tool Execution Safety | Sandbox/Blocked | Optional | Optional | None |
| Streaming | Native | Complex | Native | None |
| Configuration Depth | High | Extreme | High | Low |
| Learning Curve | Moderate | Steep | Moderate | Low |
| Community Size | Small | Massive | Large | Large |
Use SumoSpace when: You need autonomous, multi-agent planning and execution that runs entirely locally, interacting deeply with files, tools, and custom environments. Don't use SumoSpace when: You simply need to stitch together an enormous ecosystem of cloud integrations or prefer relying on prompt-engineered chains over defined agent topologies.
Installation
# Minimal (Ollama or cloud providers only)
pip install sumospace
# With local HuggingFace inference
pip install sumospace[local]
# With OpenTelemetry observability
pip install sumospace[telemetry]
# With cloud provider SDKs
pip install sumospace[cloud]
# With desktop automation tools
pip install sumospace[desktop]
# Everything
pip install sumospace[all]
| Provider | Python | RAM (Min) | GPU |
|---|---|---|---|
| Ollama | 3.10+ | 8GB | Recommended |
| HuggingFace | 3.10+ | 16GB | Optional (CUDA/MPS) |
| vLLM | 3.10+ | 32GB | Required (NVIDIA) |
| Cloud | 3.10+ | 2GB | Not Required |
Core Concepts
graph TD
Input[Task Input] --> Classifier[Classifier]
Classifier --> RAG[RAG + Web]
RAG --> Committee[Committee]
Committee --> Executor[Tool Executor]
Executor --> Synthesis[Synthesis]
Synthesis --> Output[Final Answer]
- Classifier: Identifies the intent of your task (coding, conversational, research) to intelligently toggle RAG, Web Search, or Committee deliberation.
- RAG + Web: Retrieves semantically relevant context from your ingested codebase and history, grounding the agents with accurate knowledge before planning.
- Committee: Planner, Critic, and Resolver agents deliberately analyze the request and negotiate a safe, actionable, multi-step execution plan.
- Tool Executor: Runs the approved steps against the host system (executing shell commands, patching files, reading web pages) while enforcing safety checks.
- Synthesis: Combines the original task intent, retrieved context, and tool output into a cohesive and complete final answer.
Provider Configuration
# Ollama (recommended for local development)
SumoSettings(provider="ollama", model="phi3:mini")
SumoSettings(provider="ollama", model="llama3:8b")
SumoSettings(provider="ollama", model="deepseek-coder:6.7b")
# HuggingFace (in-process, no server needed)
SumoSettings(provider="hf", model="microsoft/Phi-3-mini-4k-instruct")
SumoSettings(provider="hf", model="mistralai/Mistral-7B-Instruct-v0.2", hf_load_in_4bit=True)
# vLLM (production GPU server)
SumoSettings(provider="vllm", vllm_base_url="http://gpu-server:8000", model="deepseek-coder")
# Cloud (opt-in)
SumoSettings(provider="gemini", model="gemini-pro") # needs GOOGLE_API_KEY
SumoSettings(provider="openai", model="gpt-4o") # needs OPENAI_API_KEY
SumoSettings(provider="anthropic", model="claude-3-5-sonnet-20241022") # needs ANTHROPIC_API_KEY
export SUMO_PROVIDER=ollama
export SUMO_MODEL=phi3:mini
sumo run "your task here"
Inference Modes & Presets
| Preset | Description |
|---|---|
chat |
Direct conversation, no committee, no RAG. (--preset chat) |
chat-with-context |
Chat with codebase RAG enabled. (--preset chat-with-context) |
stateless |
Pure stateless single-turn inference, no memory. (--preset stateless) |
coding |
Full pipeline optimised for code tasks with tools. (--preset coding) |
research |
Planning + web search, no code execution. (--preset research) |
review |
Plan and critique only — never executes tools. (--preset review) |
SumoSettings.for_coding(provider="ollama", model="phi3:mini")
| Mode | Planner | Critic | Resolver | Use when |
|---|---|---|---|---|
full |
✓ | ✓ | ✓ | Default, safest |
plan_only |
✓ | ✗ | ✗ | Speed over safety |
critique_only |
✓ | ✓ | ✗ | Balanced |
disabled |
✗ | ✗ | ✗ | Chat, Q&A |
Tools
SumoSpace provides comprehensive built-in tools for agents to operate natively on your machine.
Filesystem
read_file: Read the contents of a file (e.g.path="./src/main.py")write_file: Write content to a file, creating directories as neededlist_directory: List files in a directory, optionally filteredsearch_files: Search for a pattern in filespatch_file: Apply a unified diff patch
Code & Shell
shell: Run a shell command with timeoutdependencies: Install, update, or inspect packages
Docker
docker: Run Docker CLI commands (build, run, exec, ps, compose)
Web & Desktop
web_search: Search the web using DuckDuckGo (no API key required)fetch_url: Fetch the text content of a web pagebrowser: Automate browser interactions (requiressumospace[desktop])
Creating a custom tool:
from sumospace.tools import BaseTool, ToolResult
from typing import ClassVar
class PostgresTool(BaseTool):
name = "postgres_query"
description = "Execute a read-only SQL query against PostgreSQL."
tags: ClassVar[list[str]] = ["database", "sql", "read"]
schema: ClassVar[dict] = {
"type": "object",
"properties": {
"query": {"type": "string", "description": "SQL SELECT statement"},
},
"required": ["query"],
}
async def run(self, query: str, **_) -> ToolResult:
# your implementation
...
Register it via entry points in pyproject.toml so SumoSpace loads it automatically:
[project.entry-points."sumospace.tools"]
postgres = "my_package.tools:PostgresTool"
Memory & RAG
from sumospace import SumoKernel, SumoSettings
async with SumoKernel(SumoSettings()) as kernel:
# Ingest your codebase once
await kernel.ingest("./src")
await kernel.ingest("./docs")
# Now all runs have codebase context
trace = await kernel.run("Find all authentication-related functions")
sumo ingest ./src
sumo ingest ./docs --recursive
sumo run "Explain the authentication flow"
Multimodal RAG
SumoSpace supports searching and retrieving across text, images, audio, and video files as first-class citizens. By default, these features are disabled to keep the module completely lightweight. All heavy machine learning dependencies (CLIP, Whisper, BLIP) are lazy-loaded—meaning if you don't use them, you don't pay any performance or memory cost.
Installation
To enable multimodal support, install the optional extra:
pip install sumospace[multimodal]
# Or for 4x faster audio transcription:
pip install sumospace[multimodal-fast]
Usage
Enable it in your settings:
from sumospace import SumoKernel, SumoSettings
settings = SumoSettings(multimodal_enabled=True)
async with SumoKernel(settings) as kernel:
# Ingests documents, images, audio, and video
await kernel.ingest_multimodal("./my_media_folder")
# Search with a text query -> Finds text, images (cross-modal), and audio transcripts
results = await kernel.search_multimodal("a picture of a cat")
# Search with an image -> Finds visually similar images and video frames
image_results = await kernel.search_multimodal("./query_image.jpg")
Or via CLI:
sumo ingest-all ./my_media_folder
sumo search "a picture of a cat"
sumo search ./query_image.jpg
Multi-User Deployment
from fastapi import FastAPI
from sumospace import SumoKernel, SumoSettings
from pydantic import BaseModel
app = FastAPI()
class RunRequest(BaseModel):
task: str
user_id: str
@app.post("/run")
async def run_task(request: RunRequest):
# Per-request kernel — proper isolation, no shared state
settings = SumoSettings.for_coding(
provider="ollama",
user_id=request.user_id,
scope_level="user",
)
async with SumoKernel(settings=settings) as kernel:
trace = await kernel.run(request.task)
return {
"answer": trace.final_answer,
"success": trace.success,
"steps": len(trace.step_traces),
}
By initializing SumoSettings with different scope_level and user_id values, agents execute within strict isolation boundaries preventing data crossover. Releasing the async with block ensures ChromaDB file locks are released securely.
Lifecycle Hooks
from sumospace.hooks import HookRegistry
hooks = HookRegistry()
@hooks.on("on_plan_approved")
async def require_approval(plan, verdict):
print(f"\nAgent wants to execute {len(plan.steps)} steps:")
for step in plan.steps:
print(f" {step.step_number}. [{step.tool}] {step.description}")
if input("\nApprove? [y/N]: ").strip().lower() != "y":
raise Exception("User rejected plan")
kernel = SumoKernel(settings=settings, hooks=hooks)
@hooks.on("on_task_complete")
async def notify_slack(trace):
status = "✅" if trace.success else "❌"
await slack_client.chat_postMessage(
channel="#ai-agent",
text=f"{status} Task complete in {trace.duration_ms:.0f}ms: {trace.final_answer[:200]}"
)
@hooks.on("on_task_complete")
def track_cost(trace):
metrics.increment("agent.tasks.total")
metrics.histogram("agent.tasks.duration_ms", trace.duration_ms)
metrics.increment(f"agent.tasks.intent.{trace.intent.value}")
Available hooks: on_run_start, on_run_complete, on_run_error, on_intent_classified, on_plan_generated, on_plan_approved, on_plan_rejected, on_step_start, on_step_complete, on_task_complete.
Streaming
from sumospace.kernel import StepTrace, ExecutionTrace, SynthesisChunk
async with SumoKernel(settings) as kernel:
async for event in kernel.stream_run("Refactor auth.py to use async/await"):
if isinstance(event, StepTrace):
status = "✓" if event.result.success else "✗"
print(f" [{status}] {event.tool}: {event.description}")
elif isinstance(event, SynthesisChunk):
print(event.token, end="", flush=True) # Real-time token output
elif isinstance(event, ExecutionTrace):
print(f"\n\nCompleted in {event.duration_ms:.0f}ms")
Observability
sumo logs list
sumo logs show e9f2a7a4
sumo logs search "refactor"
settings = SumoSettings(
telemetry_enabled=True,
telemetry_endpoint="http://jaeger:4317",
)
Configuration Reference
| Field | Type | Default | Env Var | Description |
|---|---|---|---|---|
provider |
str |
"hf" |
SUMO_PROVIDER |
Inference provider (ollama, hf, vllm, etc.) |
model |
str |
"default" |
SUMO_MODEL |
Model identifier |
embedding_provider |
str |
"local" |
SUMO_EMBEDDING_PROVIDER |
Provider for embeddings |
embedding_model |
str |
"BAAI/bge-base-en-v1.5" |
SUMO_EMBEDDING_MODEL |
Embedding model |
require_consensus |
bool |
True |
SUMO_REQUIRE_CONSENSUS |
Require committee consensus |
committee_enabled |
bool |
True |
SUMO_COMMITTEE_ENABLED |
Enable multi-agent deliberation |
committee_mode |
Literal["full", "plan_only", "critique_only"] |
"full" |
SUMO_COMMITTEE_MODE |
Controls which committee agents run |
committee_temperature |
float |
0.1 |
SUMO_COMMITTEE_TEMPERATURE |
Planner temperature |
committee_max_tokens |
int |
2048 |
SUMO_COMMITTEE_MAX_TOKENS |
Max tokens for planning |
execution_enabled |
bool |
True |
SUMO_EXECUTION_ENABLED |
Allow tools to execute |
rag_enabled |
bool |
True |
SUMO_RAG_ENABLED |
Enable vector store retrieval |
rag_top_k_final |
int |
5 |
SUMO_RAG_TOP_K_FINAL |
Number of chunks to return |
memory_enabled |
bool |
True |
SUMO_MEMORY_ENABLED |
Enable episodic memory read and write |
shell_sandbox |
bool |
True |
SUMO_SHELL_SANDBOX |
Use sandbox for shell tools |
max_retries |
int |
3 |
SUMO_MAX_RETRIES |
Max retries for failed tool calls |
execution_timeout |
int |
120 |
SUMO_EXECUTION_TIMEOUT |
Timeout for tool execution |
verbose |
bool |
True |
SUMO_VERBOSE |
Enable detailed logging |
dry_run |
bool |
False |
SUMO_DRY_RUN |
Simulate execution |
hf_load_in_4bit |
bool |
False |
SUMO_HF_LOAD_IN_4BIT |
Load HF models in 4-bit quantization |
secondary_provider |
Optional[str] |
None |
SUMO_SECONDARY_PROVIDER |
Fallback provider |
secondary_model |
Optional[str] |
None |
SUMO_SECONDARY_MODEL |
Fallback model |
workspace |
str |
"." |
SUMO_WORKSPACE |
Working directory |
scope_level |
str |
"user" |
SUMO_SCOPE_LEVEL |
Multi-tenant scope level |
user_id |
str |
"" |
SUMO_USER_ID |
Identifier for user scope |
session_id |
str |
"" |
SUMO_SESSION_ID |
Identifier for session scope |
project_id |
str |
"" |
SUMO_PROJECT_ID |
SUMO_PROJECT_ID |
chroma_base |
str |
".sumo_db" |
SUMO_CHROMA_BASE |
Directory for ChromaDB |
max_chunks_per_scope |
Optional[int] |
None |
SUMO_MAX_CHUNKS_PER_SCOPE |
RAG limits |
prompt_template_path |
Optional[str] |
None |
SUMO_PROMPT_TEMPLATE_PATH |
Directory containing custom prompt .txt files |
auto_load_hooks |
bool |
False |
SUMO_AUTO_LOAD_HOOKS |
Automatically load hooks from .sumo_hooks.py |
hooks_module |
Optional[str] |
None |
SUMO_HOOKS_MODULE |
Path or dotted module to load hooks from |
telemetry_enabled |
bool |
False |
SUMO_TELEMETRY_ENABLED |
Export spans via OpenTelemetry |
telemetry_endpoint |
str |
"http://localhost:4317" |
SUMO_TELEMETRY_ENDPOINT |
OTLP endpoint |
CLI Reference
sumo run <task> [--provider] [--model] [--preset] [--no-committee]
[--plan-only] [--no-rag] [--dry-run] [--verbose]
sumo ingest <path> [--recursive] [--force] [--provider]
sumo watch <path> <task> [--debounce] [--ext] [--provider]
sumo logs list [--last N] [--failed]
sumo logs show <id>
sumo logs search <query>
sumo logs export <id>
sumo logs stats
sumo replay <session-id>
Architecture
sumospace/kernel.py— SumoKernel — main orchestratorsettings.py— SumoSettings — all configurationcommittee.py— PlannerAgent, CriticAgent, ResolverAgentproviders.py— ProviderRouter, BaseProvider, all providerstools.py— BaseTool, ToolRegistry, all built-in toolsclassifier.py— RuleBasedClassifier, LLMClassifierrag.py— RAGEngine, retrieval and rerankingmemory.py— MemoryManager, working + episodic memoryingest.py— UniversalIngestor, file loaders, chunkingscope.py— ScopeManager, multi-tenant isolationaudit.py— AuditLogger, session persistence, statstelemetry.py— SumoTelemetry, OpenTelemetry integrationhooks.py— HookRegistry, lifecycle eventstemplates.py— TemplateManager, prompt customizationcli.py— Typer CLI applicationexceptions.py— SumoSpaceError hierarchy
Contributing
We welcome contributions! See CONTRIBUTING.md for details.
- Fork the repo.
pip install -e ".[dev]"- Add your tool (subclass
BaseTool) or provider (subclassBaseProvider). pytest tests/(requires 75%+ coverage).- Submit a PR.
Roadmap
v0.2 — Ecosystem [ ] Plugin entry point marketplace [ ] MkDocs hosted API reference [ ] LangChain tool adapter (use LC tools in SumoSpace) [ ] Jupyter notebook integration
v0.3 — Scale
[ ] Distributed task queue (Celery/Redis backend)
[ ] Multi-modal tool support (image input/output)
[ ] Agent-to-agent communication (nested kernels)
[ ] Web UI dashboard for sumo logs
License
MIT. See LICENSE for details.
Acknowledgements
Built on top of ChromaDB, HuggingFace Transformers, Ollama, vLLM, Pydantic, Typer, Rich, and OpenTelemetry.
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file sumospace-0.3.3.tar.gz.
File metadata
- Download URL: sumospace-0.3.3.tar.gz
- Upload date:
- Size: 183.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
80004caffc33fdea6b4ab97e45381af5651a4784e10189c1f70d28c1fdb8dc74
|
|
| MD5 |
1403db93a260bc37d6554677c3938bdf
|
|
| BLAKE2b-256 |
b6e568b1b405de90a62ecc08177495f0e78a3ef404e7fd9480767d32cd118ba0
|
File details
Details for the file sumospace-0.3.3-py3-none-any.whl.
File metadata
- Download URL: sumospace-0.3.3-py3-none-any.whl
- Upload date:
- Size: 158.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.11.4
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
23871ae356d4f0bb81a6a8aedaae3b9b85a50da9cd5d7594fbc7c1405839a659
|
|
| MD5 |
29731ce04ead458ffeba02ef0114aebf
|
|
| BLAKE2b-256 |
9ae65ae610f30d093ce31ba2f78b358e2ff2f75a207f9c54bf85c2ca0168cd2b
|