Vector Vein inspired agent framework with cycle runtime, tools and memory management

Project description

vv-agent

A lightweight agent framework extracted from VectorVein's production runtime. Cycle-based execution with pluggable LLM backends, tool dispatch, memory compression, and distributed scheduling.

Architecture

AgentRuntime
├── CycleRunner          # single LLM turn: context -> completion -> tool calls
├── ToolCallRunner       # tool dispatch, directive convergence (finish/wait_user/continue)
├── RuntimeHookManager   # before/after hooks for LLM, tool calls, memory compaction
├── MemoryManager        # automatic history compression when context exceeds threshold
└── ExecutionBackend     # cycle loop scheduling
    ├── InlineBackend    # synchronous (default)
    ├── ThreadBackend    # thread pool with futures
    └── CeleryBackend    # distributed, per-cycle Celery task dispatch

Core types live in vv_agent.types: AgentTask, AgentResult, Message, CycleRecord, ToolCall.

Task completion is tool-driven: the agent calls task_finish or ask_user to signal terminal states. No implicit "last message = answer" heuristics.

Setup

cp local_settings.example.py local_settings.py
# Fill in your API keys and endpoints in local_settings.py

uv sync --dev
uv run pytest

Quick Start

CLI

uv run vv-agent --prompt "Summarize this framework" --backend moonshot --model kimi-k2.5

# With per-cycle logging
uv run vv-agent --prompt "Summarize this framework" --backend moonshot --model kimi-k2.5 --verbose

CLI flags: --settings-file, --backend, --model, --verbose.

Programmatic

from vv_agent.config import build_openai_llm_from_local_settings
from vv_agent.runtime import AgentRuntime
from vv_agent.tools import build_default_registry
from vv_agent.types import AgentTask

llm, resolved = build_openai_llm_from_local_settings("local_settings.py", backend="moonshot", model="kimi-k2.5")
runtime = AgentRuntime(llm_client=llm, tool_registry=build_default_registry())

result = runtime.run(AgentTask(
    task_id="demo",
    model=resolved.model_id,
    system_prompt="You are a helpful assistant.",
    user_prompt="What is 1+1?",
))
print(result.status, result.final_answer)

SDK

from vv_agent.sdk import AgentSDKClient, AgentSDKOptions

client = AgentSDKClient(options=AgentSDKOptions(
    settings_file="local_settings.py",
    default_backend="moonshot",
    default_model="kimi-k2.5",
))
result = client.run("Explain Python's GIL in one sentence.")
print(result.final_answer)

SDK Workspace Override (Session/Task)

AgentSDKOptions.workspace is the SDK default workspace. You can override it per one-shot run, or bind a fixed workspace to a session.

Priority for workspace resolution is:

Explicit workspace passed to run(...) / query(...) / create_session(...)
AgentSDKOptions.workspace

from vv_agent.sdk import AgentSDKClient, AgentSDKOptions

client = AgentSDKClient(options=AgentSDKOptions(
    settings_file="local_settings.py",
    default_backend="moonshot",
    default_model="kimi-k2.5",
    workspace="./workspace/default",
))

# One-shot override: this run uses ./workspace/task-a
run = client.run(prompt="Create notes.md", workspace="./workspace/task-a")

# Session override: all turns in this session stay in ./workspace/session-b
session = client.create_session(workspace="./workspace/session-b")
session.prompt("Create todo.md")
session.follow_up("Append one more todo item")
session.continue_run()

Notes:

AgentSession.workspace is fixed at session creation time.
prompt()/continue_run()/follow_up() all execute in that same session workspace.
session.cancel() requests cancellation for the currently running prompt in that session.
Top-level SDK helpers vv_agent.sdk.run(...) and vv_agent.sdk.query(...) also accept workspace=....

Shell Runtime Configuration (Windows)

bash runtime defaults are a startup/session configuration, not tool-call arguments.

Global defaults: AgentSDKOptions.bash_shell, AgentSDKOptions.windows_shell_priority, AgentSDKOptions.bash_env
Per-agent override: AgentDefinition.bash_shell, AgentDefinition.windows_shell_priority, AgentDefinition.bash_env
Recommended Windows priority: ["git-bash", "powershell", "cmd"]
On Windows, bash-tool child processes default PYTHONUTF8=1 and PYTHONIOENCODING=utf-8 unless already overridden via the parent environment or bash_env.
run(...) and create_session(...) both inherit startup shell defaults.
The bash tool schema description includes a runtime shell hint (resolved shell kind + invocation prefix), so the model sees which shell command style is expected before calling the tool.
The runtime shell hint is frozen per task/session-run to keep tool schemas stable across cycles and preserve LLM prompt cache efficiency.

from vv_agent.sdk import AgentDefinition, AgentSDKClient, AgentSDKOptions

client = AgentSDKClient(
    options=AgentSDKOptions(
        settings_file="local_settings.py",
        default_backend="moonshot",
        windows_shell_priority=["git-bash", "powershell", "cmd"],
        bash_env={"PIP_INDEX_URL": "https://pypi.tuna.tsinghua.edu.cn/simple"},
    ),
    agents={
        "desktop": AgentDefinition(
            description="Desktop helper",
            model="kimi-k2.5",
            # Optional hard override for this agent only:
            bash_shell=None,
            bash_env={"HTTP_PROXY": "http://127.0.0.1:7890"},
        )
    },
)

Execution Backends

The cycle loop is delegated to a pluggable ExecutionBackend.

Backend	Use case
`InlineBackend`	Default. Synchronous, single-process.
`ThreadBackend`	Thread pool. Non-blocking `submit()` returns a `Future`.
`CeleryBackend`	Distributed. Each cycle dispatched as an independent Celery task.

CeleryBackend

Two modes:

Inline fallback (no RuntimeRecipe): cycles run in-process, same as InlineBackend.
Distributed (with RuntimeRecipe): each cycle is a Celery task. Workers rebuild the AgentRuntime from the recipe and load state from a shared StateStore (SQLite or Redis).

from vv_agent.runtime.backends.celery import CeleryBackend, RuntimeRecipe, register_cycle_task

register_cycle_task(celery_app)

recipe = RuntimeRecipe(
    settings_file="local_settings.py",
    backend="moonshot",
    model="kimi-k2.5",
    workspace="./workspace",
)
backend = CeleryBackend(celery_app=app, state_store=store, runtime_recipe=recipe)
runtime = AgentRuntime(llm_client=llm, tool_registry=registry, execution_backend=backend)

Install celery extras: uv sync --extra celery.

Cancellation and Streaming

from vv_agent.runtime import CancellationToken, ExecutionContext

# Cancel from another thread
token = CancellationToken()
ctx = ExecutionContext(cancellation_token=token)
result = runtime.run(task, ctx=ctx)

# Stream LLM output token by token
ctx = ExecutionContext(stream_callback=lambda text: print(text, end=""))
result = runtime.run(task, ctx=ctx)

Runtime Log Payloads

tool_result runtime events now carry full tool output in result/content by default (no implicit truncation). content_preview and assistant_preview are still emitted for UI convenience.

If you need shorter previews for logs/transport, configure an explicit preview limit:

from vv_agent.sdk import AgentSDKOptions

options = AgentSDKOptions(
    settings_file="local_settings.py",
    default_backend="moonshot",
    log_preview_chars=220,  # optional: enable preview truncation explicitly
)

Workspace Backends

Workspace file I/O is delegated to a pluggable WorkspaceBackend protocol. All built-in file tools (read_file, write_file, list_files, etc.) go through this abstraction.

list_files includes built-in safety defaults for large workspaces:

Returns at most 500 paths per call by default (max_results can tune this, with hard cap).
Uses ripgrep (rg) for fast local traversal when available, with automatic fallback to Python walk.
workspace_grep also uses rg for local workspaces (with Python fallback), defaults to smart-case matching (lowercase patterns are case-insensitive; patterns with uppercase stay case-sensitive), and skips hidden/common dependency roots unless explicitly included.
When listing from workspace root, common dependency/cache roots (for example node_modules, .venv, .git) are summarized instead of expanded.
You can still inspect those paths explicitly by setting path to that directory (or by setting include_ignored=true).
Supports scan_limit to stop early on very large trees; when triggered, response sets count_is_estimate=true.

Backend	Use case
`LocalWorkspaceBackend`	Default. Reads/writes to a local directory with path-escape protection.
`MemoryWorkspaceBackend`	Pure in-memory dict storage. Great for testing and sandboxed runs.
`S3WorkspaceBackend`	S3-compatible object storage (AWS S3, Aliyun OSS, MinIO, Cloudflare R2).

from vv_agent.workspace import LocalWorkspaceBackend, MemoryWorkspaceBackend

# Explicit local backend
runtime = AgentRuntime(
    llm_client=llm,
    tool_registry=registry,
    workspace_backend=LocalWorkspaceBackend(Path("./workspace")),
)

# In-memory backend for testing
runtime = AgentRuntime(
    llm_client=llm,
    tool_registry=registry,
    workspace_backend=MemoryWorkspaceBackend(),
)

S3WorkspaceBackend

Install the optional S3 dependency: uv pip install 'vv-agent[s3]'.

from vv_agent.workspace import S3WorkspaceBackend

backend = S3WorkspaceBackend(
    bucket="my-bucket",
    prefix="agent-workspace",
    endpoint_url="https://oss-cn-hangzhou.aliyuncs.com",  # or None for AWS
    aws_access_key_id="...",
    aws_secret_access_key="...",
    addressing_style="virtual",  # "path" for MinIO
)

Custom Backend

Implement the WorkspaceBackend protocol (8 methods) to plug in any storage:

from vv_agent.workspace import WorkspaceBackend

class MyBackend:
    def list_files(self, base: str, glob: str) -> list[str]: ...
    def read_text(self, path: str) -> str: ...
    def read_bytes(self, path: str) -> bytes: ...
    def write_text(self, path: str, content: str, *, append: bool = False) -> int: ...
    def file_info(self, path: str) -> FileInfo | None: ...
    def exists(self, path: str) -> bool: ...
    def is_file(self, path: str) -> bool: ...
    def mkdir(self, path: str) -> None: ...

Modules

Module	Description
`vv_agent.runtime.AgentRuntime`	Top-level state machine (completed / wait_user / max_cycles / failed)
`vv_agent.runtime.CycleRunner`	Single LLM turn and cycle record construction
`vv_agent.runtime.ToolCallRunner`	Tool execution with directive convergence
`vv_agent.runtime.RuntimeHookManager`	Hook dispatch (before/after LLM, tool call, memory compact)
`vv_agent.runtime.StateStore`	Checkpoint persistence protocol (`InMemoryStateStore` / `SqliteStateStore` / `RedisStateStore`)
`vv_agent.memory.MemoryManager`	Context compression when history exceeds threshold
`vv_agent.workspace`	Pluggable file storage: `LocalWorkspaceBackend`, `MemoryWorkspaceBackend`, `S3WorkspaceBackend`
`vv_agent.tools`	Built-in tools: workspace I/O, todo, bash, image, sub-agents, skills
`vv_agent.sdk`	High-level SDK: `AgentSDKClient`, `AgentSession`, `AgentResourceLoader`
`vv_agent.skills`	Agent Skills support (`SKILL.md` parsing, validation, unified normalization, prompt rendering with budget management, `activate_skill` tool)
`vv_agent.llm.VVLlmClient`	Unified LLM interface via `vv-llm` (endpoint rotation, retry, streaming)
`vv_agent.config`	Model/endpoint/key resolution from `local_settings.py`

Memory Compaction

MemoryManager compacts history when AgentTask.memory_compact_threshold is exceeded.

Task-level knobs:
- memory_compact_threshold (default 128000)
- memory_threshold_percentage (warning threshold percentage, default 90)
SDK mapping:
- AgentDefinition.memory_compact_threshold
- AgentDefinition.memory_threshold_percentage
- AgentSDKClient.prepare_task(...) 会把这两个字段透传到 AgentTask。
Effective-length strategy (backend-aligned):
- If previous cycle token usage exists:
  - effective_length = previous_total_tokens + len(json.dumps(recent_tool_messages))
- Otherwise fallback to:
  - len(json.dumps(messages[2:]))
Compaction pipeline:
1. Structural cleanup (stale tool calls, orphan tool messages, assistant-no-tool collapse, old tool result artifactization)
2. If still over threshold, generate compressed memory summary

Runtime metadata keys

Pass these via AgentTask.metadata:

memory_keep_recent_messages
include_memory_warning
tool_result_compact_threshold
tool_result_keep_last
tool_result_excerpt_head
tool_result_excerpt_tail
tool_calls_keep_last
assistant_no_tool_keep_last
tool_result_artifact_dir
summary_event_limit

Memory summary model selection priority

Priority is strict:

AgentTask.metadata
- memory_summary_backend / memory_summary_model
- aliases: compress_memory_summary_backend / compress_memory_summary_model
- aliases: memory_compress_backend / memory_compress_model
local_settings.py constants
- DEFAULT_USER_MEMORY_SUMMARIZE_BACKEND / DEFAULT_USER_MEMORY_SUMMARIZE_MODEL
- aliases: DEFAULT_MEMORY_SUMMARIZE_BACKEND / DEFAULT_MEMORY_SUMMARIZE_MODEL
- aliases: VV_AGENT_MEMORY_SUMMARY_BACKEND / VV_AGENT_MEMORY_SUMMARY_MODEL
Fallback
- runtime default_backend + current task model

Built-in Tools

list_files, file_info, read_file, write_file, file_str_replace, workspace_grep, compress_memory, todo_write, task_finish, ask_user, bash, read_image, create_sub_task, sub_task_status.

Custom tools can be registered via ToolRegistry.register().

Sub-agents

Configure named sub-agents on AgentTask.sub_agents. The parent agent delegates work via create_sub_task. Use task_description for one task, tasks for batch mode, and wait_for_completion=false to start background sub-tasks. Each sub-agent gets its own runtime, model, and tool set.

Each delegated sub-task now runs in a real AgentSession (session id defaults to the sub-task id). Tool payloads include session_id, and runtime events include stable identifiers (task_id / session_id) so host apps can subscribe, persist, and stream sub-task progress independently (including sub_agent_stream_delta token chunks).

Batch mode in create_sub_task dispatches valid sub-task items through the runtime execution backend's parallel_map, so synchronous batches run concurrently when the backend supports parallel execution.

Use sub_task_status to query sub-task states, inspect lightweight progress snapshots (detail_level=snapshot), or send follow-up messages to running/completed sub-tasks. When you run agents through AgentSDKClient.create_session(), the sub-task registry stays attached to that session, so later turns can still query background sub-tasks created earlier in the same session.

Sub-task runtime metadata now includes task_id, session_id, and browser_scope_key for each sub-agent run, so session-scoped tools (for example, browser controllers) stay isolated across parallel sub-tasks.

Host apps can interrupt a currently running sub-agent by calling vv_agent.runtime.engine.steer_sub_agent_session(session_id=..., prompt=...).

When a sub-agent uses a different model from the parent, the runtime needs settings_file and default_backend to resolve the LLM client.

Examples

24 numbered examples in examples/. See examples/README.md for the full list.

uv run python examples/01_quick_start.py
uv run python examples/24_workspace_backends.py

Testing

uv run pytest                              # unit tests (no network)
uv run ruff check .                        # lint
uv run ty check                            # type check

V_AGENT_RUN_LIVE_TESTS=1 uv run pytest -m live   # integration tests (needs real LLM)

Environment variables for live tests:

Variable	Default	Description
`V_AGENT_LOCAL_SETTINGS`	`local_settings.py`	Settings file path
`V_AGENT_LIVE_BACKEND`	`moonshot`	LLM backend
`V_AGENT_LIVE_MODEL`	`kimi-k2.5`	Model name
`V_AGENT_ENABLE_BASE64_KEY_DECODE`	-	Set `1` to enable base64 API key decoding

Project details

Release history Release notifications | RSS feed

0.1.62

May 5, 2026

0.1.61

Apr 24, 2026

0.1.60

Apr 11, 2026

0.1.59

Apr 2, 2026

0.1.58

Apr 1, 2026

0.1.57

Apr 1, 2026

0.1.56

Mar 23, 2026

0.1.55

Mar 21, 2026

0.1.53

Mar 20, 2026

0.1.51

Mar 20, 2026

This version

0.1.50

Mar 20, 2026

0.1.49

Mar 20, 2026

0.1.48

Mar 19, 2026

0.1.47

Mar 19, 2026

0.1.46

Mar 18, 2026

0.1.44

Mar 18, 2026

0.1.43

Mar 17, 2026

0.1.41

Mar 12, 2026

0.1.40

Mar 9, 2026

0.1.39

Mar 8, 2026

0.1.38

Mar 4, 2026

0.1.37

Mar 4, 2026

0.1.36

Mar 3, 2026

0.1.34

Mar 2, 2026

0.1.33

Mar 2, 2026

0.1.32

Mar 1, 2026

0.1.31

Feb 28, 2026

0.1.28

Feb 27, 2026

0.1.27

Feb 27, 2026

0.1.26

Feb 27, 2026

0.1.25

Feb 26, 2026

0.1.24

Feb 26, 2026

0.1.23

Feb 26, 2026

0.1.22

Feb 26, 2026

0.1.21

Feb 26, 2026

0.1.20

Feb 26, 2026

0.1.19

Feb 25, 2026

0.1.18

Feb 25, 2026

0.1.17

Feb 25, 2026

0.1.16

Feb 24, 2026

0.1.15

Feb 23, 2026

0.1.14

Feb 23, 2026

0.1.13

Feb 23, 2026

0.1.12

Feb 23, 2026

0.1.11

Feb 22, 2026

0.1.10

Feb 22, 2026

0.1.9

Feb 22, 2026

0.1.8

Feb 21, 2026

0.1.7

Feb 21, 2026

0.1.6

Feb 21, 2026

0.1.5

Feb 21, 2026

0.1.4

Feb 21, 2026

0.1.3

Feb 20, 2026

0.1.2

Feb 20, 2026

0.1.1

Feb 20, 2026

0.1.0

Feb 20, 2026

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

vv_agent-0.1.50.tar.gz (106.7 kB view details)

Uploaded Mar 20, 2026 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

vv_agent-0.1.50-py3-none-any.whl (141.9 kB view details)

Uploaded Mar 20, 2026 Python 3

File details

Details for the file vv_agent-0.1.50.tar.gz.

File metadata

Download URL: vv_agent-0.1.50.tar.gz
Upload date: Mar 20, 2026
Size: 106.7 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vv_agent-0.1.50.tar.gz
Algorithm	Hash digest
SHA256	`7f8b298ad2a9ddc180ca095f832272405f03f55087fa62d7c16eee33eb91d914`
MD5	`22038add434ed99461803d092984b868`
BLAKE2b-256	`ec192849e2766d0eebb9608d26814bd468da6c336a0abee50fadbc5246dd3a88`

See more details on using hashes here.

Provenance

The following attestation bundles were made for vv_agent-0.1.50.tar.gz:

Publisher: release.yml on AndersonBY/vv-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vv_agent-0.1.50.tar.gz
- Subject digest: 7f8b298ad2a9ddc180ca095f832272405f03f55087fa62d7c16eee33eb91d914
- Sigstore transparency entry: 1145741603
- Sigstore integration time: Mar 20, 2026
Source repository:
- Permalink: AndersonBY/vv-agent@5b20f493ea7779f79dc22807e2f4c953ef810140
- Branch / Tag: refs/tags/v0.1.50
- Owner: https://github.com/AndersonBY
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5b20f493ea7779f79dc22807e2f4c953ef810140
- Trigger Event: push

File details

Details for the file vv_agent-0.1.50-py3-none-any.whl.

File metadata

Download URL: vv_agent-0.1.50-py3-none-any.whl
Upload date: Mar 20, 2026
Size: 141.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for vv_agent-0.1.50-py3-none-any.whl
Algorithm	Hash digest
SHA256	`df6c8c43d0838baad61dc5b7c32f7c9ba3d248391700ca112d37e9df375bf1ff`
MD5	`357a9714c784c83fbb8ff40fc369a1b3`
BLAKE2b-256	`e0d7888fcd8a10a062062f0baec50cd6a9a4e28e67ad8a1a597231d4c5def43f`

See more details on using hashes here.

Provenance

The following attestation bundles were made for vv_agent-0.1.50-py3-none-any.whl:

Publisher: release.yml on AndersonBY/vv-agent

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: vv_agent-0.1.50-py3-none-any.whl
- Subject digest: df6c8c43d0838baad61dc5b7c32f7c9ba3d248391700ca112d37e9df375bf1ff
- Sigstore transparency entry: 1145741658
- Sigstore integration time: Mar 20, 2026
Source repository:
- Permalink: AndersonBY/vv-agent@5b20f493ea7779f79dc22807e2f4c953ef810140
- Branch / Tag: refs/tags/v0.1.50
- Owner: https://github.com/AndersonBY
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release.yml@5b20f493ea7779f79dc22807e2f4c953ef810140
- Trigger Event: push

vv-agent 0.1.50

Navigation

Verified details

Maintainers

Unverified details

Meta

Project description

vv-agent

Architecture

Setup

Quick Start

CLI

Programmatic

SDK

SDK Workspace Override (Session/Task)

Shell Runtime Configuration (Windows)

Execution Backends

CeleryBackend

Cancellation and Streaming

Runtime Log Payloads

Workspace Backends

S3WorkspaceBackend

Custom Backend

Modules

Memory Compaction

Runtime metadata keys

Memory summary model selection priority

Built-in Tools

Sub-agents

Examples

Testing

Project details

Verified details

Maintainers

Unverified details

Meta

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

Provenance

File details

File metadata

File hashes

Provenance