DefenseAgent

Multi-LLM agent framework with mem0-backed memory, llama-index RAG, MCP tool support, and reflection.

English · 中文 README

A Python harness for building single-agent LLM applications. Define an agent in one YAML profile, instantiate it with one line of Python, and run tasks with any of three execution strategies.

from DefenseAgent.agent import AgentConfig, ReActAgent
from DefenseAgent.examples import EXAMPLE_PROFILE_PATH

config = AgentConfig(profile=EXAMPLE_PROFILE_PATH)
agent  = ReActAgent(config)
result = await agent.run("Summarise today's plan in one sentence.")  # inside an async context; see Quickstart

Features

  • One-file agent definition. Identity, LLM provider, tools, memory, RAG, system prompt — all in one strictly-validated YAML (extra="forbid"; unknown fields raise ConfigValidationError on load).
  • Per-field configuration fallback. Every value can be set in the profile or in .env; profile wins per field, .env fills the gaps. Switch LLM providers (openai, anthropic, deepseek, qwen, google, vllm) without code changes.
  • Three agent strategies. SimpleAgent (one-shot), ReActAgent (tool-call loop), PlanAndSolveAgent (plan → execute → synthesise). All built from the same AgentConfig.
  • Three tool sources, one registry. Local skill directories (Anthropic-style SKILL.md bundles), MCP servers (stdio / SSE / WebSocket / streamable-http), Python functions (referenced from the profile by file path or dotted module).
  • Persistent memory with a built-in tool. mem0-backed Qdrant storage; agents automatically expose a memory_recall tool to the LLM. ContextCompressor keeps the working context within a configured token budget.
  • Optional RAG with a built-in tool. Drop documents into a directory, set rag.enabled: true, get a rag_search tool. Embedder credentials follow the same per-field profile→env fallback.
  • Multimodal input. agent.run(task, images=[...]) sends an OpenAI-style content-block message. Each image accepts a local file path, an http(s):// URL, or a data: URL. Supported on every OpenAI-compatible provider; the Anthropic adapter raises a clear LLMAdapterError if list content reaches it.
  • Dependency-injectable. LLM, memory, tools, reflector, compressor and logger are all replaceable in AgentConfig for tests and custom wiring (see the sketch after this list).
  • Offline test suite. No network or external services required to run pytest.
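
A sketch of that wiring for tests. ScriptedLLM, its chat() signature, and the llm= keyword are assumptions for illustration, not the library's verified API; see DefenseAgent/llm/ for the real adapter interface.

from DefenseAgent.agent import AgentConfig, SimpleAgent
from DefenseAgent.examples import EXAMPLE_PROFILE_PATH

class ScriptedLLM:
    """Hypothetical test double that always answers with a canned string."""

    def __init__(self, answer: str):
        self.answer = answer

    async def chat(self, messages, **kwargs):  # signature assumed, not verified
        return self.answer

config = AgentConfig(profile=EXAMPLE_PROFILE_PATH, llm=ScriptedLLM("stub answer"))
agent = SimpleAgent(config)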

Install

git clone https://github.com/yishu031031/DefenseAgent.git
cd DefenseAgent
conda create -n agent_lab python=3.12 -y
conda activate agent_lab
pip install -r requirements.txt

Configure

Create .env in the repo root. Minimum:

AGENT_LAB_LLM_PROVIDER=deepseek
DEEPSEEK_API_KEY=sk-…
DEEPSEEK_MODEL=deepseek-chat
DEEPSEEK_BASE_URL=https://api.deepseek.com/v1

EMBEDDING_API_KEY=sk-…
EMBEDDING_BASE_URL=https://api.openai.com/v1
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIMS=1536

TAVILY_API_KEY=    # optional, used by scripts/react_tools_memory_demo.py

Resolution order, per field: profile YAML → env var → schema default. Whitespace-only values are treated as unset.
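
A sketch of that resolution logic (resolve_field is illustrative, not a function the package exports):

import os

def resolve_field(profile_value: str | None, env_var: str, default: str | None = None) -> str | None:
    # profile wins per field; .env fills the gaps; whitespace-only counts as unset
    if profile_value is not None and profile_value.strip():
        return profile_value.strip()
    env_value = os.getenv(env_var, "")
    if env_value.strip():
        return env_value.strip()
    return default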

Providers and credentials

AGENT_LAB_LLM_PROVIDER selects the adapter. Each provider has its own block of <PROVIDER>_* env vars (<PROVIDER>_API_KEY, <PROVIDER>_MODEL, <PROVIDER>_BASE_URL). The cross-provider LLM_API_KEY / LLM_MODEL_ID / LLM_BASE_URL tier overrides the per-provider tier when set.
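
For example, with both tiers set, the cross-provider value wins:

AGENT_LAB_LLM_PROVIDER=deepseek
DEEPSEEK_MODEL=deepseek-chat
LLM_MODEL_ID=deepseek-reasoner   # takes precedence over DEEPSEEK_MODEL while set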

| Provider | Adapter | Typical key format | Default base URL | Example chat models |
| --- | --- | --- | --- | --- |
| openai | OpenAICompatibleAdapter | sk-… or sk-proj-… | https://api.openai.com/v1 | gpt-4o-mini, gpt-4o, o3-mini |
| anthropic | AnthropicAdapter | sk-ant-… | https://api.anthropic.com | claude-sonnet-4-6, claude-opus-4-7 |
| deepseek | OpenAICompatibleAdapter | sk-… | https://api.deepseek.com/v1 | deepseek-chat, deepseek-reasoner |
| qwen (DashScope, OpenAI-compat) | OpenAICompatibleAdapter | sk-… | https://dashscope.aliyuncs.com/compatible-mode/v1 | qwen-plus, qwen-vl-max, qwen-vl-plus |
| google (OpenAI-compat endpoint) | OpenAICompatibleAdapter | sk-… | https://generativelanguage.googleapis.com/v1beta/openai | gemini-2.0-flash |
| vllm (self-hosted) | OpenAICompatibleAdapter | any string (e.g. EMPTY / token-not-needed) | depends on deployment, e.g. http://localhost:8000/v1 | whatever the vLLM server is serving |

Embedding: a separate EMBEDDING_* block. Common pairings:

| Embedder | EMBEDDING_BASE_URL | EMBEDDING_MODEL | EMBEDDING_DIMS |
| --- | --- | --- | --- |
| OpenAI | https://api.openai.com/v1 | text-embedding-3-small | 1536 |
| OpenAI | https://api.openai.com/v1 | text-embedding-3-large | 3072 |
| DashScope | https://dashscope.aliyuncs.com/compatible-mode/v1 | text-embedding-v3 | 1024 |
| ModelScope | https://api-inference.modelscope.cn/v1 | Qwen/Qwen3-Embedding-0.6B | 1024 |
| ModelScope | https://api-inference.modelscope.cn/v1 | Qwen/Qwen3-Embedding-8B | 4096 |

EMBEDDING_DIMS must match what the model emits or the Qdrant collection rejects writes — set it from the model's documented vector size.

Quickstart

import asyncio
from DefenseAgent.agent import AgentConfig, ReActAgent
from DefenseAgent.examples import EXAMPLE_PROFILE_PATH

config = AgentConfig(profile=EXAMPLE_PROFILE_PATH)

async def main():
    async with ReActAgent(config) as agent:
        result = await agent.run("Summarise today's plan in one sentence.")
        print(result.final_answer)

asyncio.run(main())

End-to-end demo (calculator + Tavily web search + memory recall):

python scripts/react_tools_memory_demo.py

Building your own agent

Copy DefenseAgent/examples/example_agent/ (also available at runtime as EXAMPLE_AGENT_DIR in DefenseAgent.examples) to a new directory and edit profile.yaml. Every block under agent: is independent, and all are optional except identity. All fields are validated by pydantic with extra="forbid".

llm:

llm:
  provider:           # str | null. One of: openai | anthropic | deepseek | qwen | google | vllm. Falls back to AGENT_LAB_LLM_PROVIDER.
  model:              # str | null. Provider-specific model id (see Providers table). Falls back to <PROVIDER>_MODEL or LLM_MODEL_ID.
  base_url:           # str | null. Provider endpoint. Falls back to <PROVIDER>_BASE_URL or LLM_BASE_URL.
  api_key:            # str | null. Falls back to <PROVIDER>_API_KEY. Recommend leaving blank in shared profiles.

All four fields are str | None. Each falls back to .env independently. Whitespace-only values count as unset, so a half-edited YAML can't shadow correct env state.

Identity (required)

id: "agent_001"     # str, min_length=1. Used as agent_id in mem0 + as the log file name.
name: "Nova Patel"  # str, min_length=1. The {name} placeholder.
age: 27             # int ≥ 0.
traits: "..."       # str, min_length=1. Free-form trait list.
backstory: "..."    # str, min_length=1.
initial_plan: "..." # str, min_length=1.

Every field is non-empty after stripping. All six are exposed as {id} {name} {age} {traits} {backstory} {initial_plan} placeholders in the prompt template.

cognitive:

cognitive:
  max_steps_per_cycle: 10     # int ≥ 1, default 10. Caps the ReAct tool-call loop per run().
  reflection_threshold: 5     # int ≥ 1, default 5. Unreflected-memory count that triggers Reflector.maybe_reflect().
  importance_threshold: 7     # float in [1, 10], default 7. Floor for "important" memories during reflection.
  planning_horizon: "1 day"   # str, min_length=1, default "1 day". Free-form; surfaced to the LLM in prompts.

memory:

memory:
  is_retrieve: true                       # bool, default true. Wires up the memory_recall tool.
  history_mode: add                       # 'add' | 'overwrite'. 'overwrite' enables diff/rollback.
  search_limit: 10                        # int ≥ 1, default 10. Max records returned per memory_recall call.
  ignore_roles: [tool, system]            # list[str], default ['tool', 'system']. Roles excluded from persistence.
  ignore_fields: [reasoning_content]      # list[str], default ['reasoning_content'].
  context_limit: 128000                   # int ≥ 1024, default 128000. Token budget before ContextCompressor prunes.
  prune_protect: 40000                    # int ≥ 0, default 40000. Tokens never touched during prune.
  prune_minimum: 20000                    # int ≥ 0, default 20000. Min tokens kept after prune.
  reserved_buffer: 20000                  # int ≥ 0, default 20000. Safety margin.
  enable_summary: true                    # bool, default true. Allow ContextCompressor to LLM-summarise old turns.
  storage_path:                           # str | null. Default: <profile_dir>/memory/.

mem0 + Qdrant on disk. Registers a memory_recall tool. ContextCompressor runs before each LLM turn.

rag:

rag:
  enabled: false                          # bool, default false. Flip to true to wire LlamaIndexRAG + rag_search.
  documents_dir: rag_corpus               # str | null. Relative to profile dir. Auto-indexed on first run().
  storage_dir: rag_index                  # str | null. Where the FAISS index is persisted.
  embedding_provider: openai              # 'openai' | 'huggingface', default 'openai'.
  embedding:                              # str | null. → EMBEDDING_MODEL.
  embedding_api_key:                      # str | null. → EMBEDDING_API_KEY.
  embedding_base_url:                     # str | null. → EMBEDDING_BASE_URL.
  embedding_dims:                         # int ≥ 1, null. → EMBEDDING_DIMS.
  chunk_size: 512                         # int ≥ 1, default 512. Tokens per chunk during ingestion.
  chunk_overlap: 50                       # int ≥ 0, default 50. Token overlap between adjacent chunks.
  top_k: 5                                # int ≥ 1, default 5. Default rag_search top_k.
  score_threshold: 0.0                    # float in [0.0, 1.0], default 0.0. Min score to return.
  retrieve_only: true                     # bool, default true. When false, RAG also synthesises an answer.
  use_huggingface: false                  # bool, default false. ms-agent's HF download path.

When enabled: true, registers a rag_search tool. Embedder fields use the same per-field profile→env fallback as llm:.

tools:

tools:
  skills:                                 # list[str]. Skill directory paths, relative to profile dir.
    - skills/tabular-report
  mcp:                                    # list[MCPServerConfig].
    - command: uvx                        # str | null. Required for stdio servers.
      args: [mcp-server-filesystem, /tmp] # list[str], default [].
      env: { TOKEN: "" }                  # dict[str,str] | null. Empty values interpolated from process env.
      cwd:                                # str | null. Optional working dir.
      include: [read_file]                # list[str]. Whitelist; mutually exclusive with `exclude`.
      exclude: []                         # list[str]. Blacklist.
    - transport: sse                      # 'stdio' | 'sse' | 'websocket' | 'streamable_http'.
      url: https://mcp.example.com/sse    # str | null. Required when transport != 'stdio'.
      headers: { Authorization: "..." }   # dict[str,str] | null.
      timeout: 30                         # float ≥ 0 | null. Connection timeout (seconds).
      sse_read_timeout: 300               # float ≥ 0 | null. SSE long-poll timeout.
  python:                                 # list[str]. Python entry-point strings.
    - python_tools/calc.py:calculator
    - my_pkg.search:web_search
  allow_skill_execution: false            # bool, default false. Opt-in to script execution.
  skill_execution_timeout: 300            # int ≥ 1, default 300. Subprocess timeout (seconds).

Each MCP entry must specify exactly one of command: (stdio) or url: (network). include and exclude are mutually exclusive per server.

Where to place a Python tool file

tools.python: accepts two forms:

1. Relative file path. Resolved against the profile's directory and loaded via importlib.util.spec_from_file_location. No sys.path setup needed.

DefenseAgent/examples/example_agent/
├── profile.yaml
├── python_tools/
│   └── calc.py            # def calculator(expression: str) -> str
└── skills/

Profile entry: python_tools/calc.py:calculator.

2. Dotted module path. The module must be importable from the running interpreter. Resolved via importlib.import_module.

my_pkg/
├── __init__.py
└── search.py              # def web_search(query: str) -> str

Profile entry: my_pkg.search:web_search.

For both forms, the function's type hints become the tool's input schema and its docstring becomes the tool description.
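
For instance, this hypothetical tool file (the name and the exact schema mapping are illustrative):

# python_tools/weather.py
def describe_weather(city: str, unit: str = "celsius") -> str:
    """Return a one-line weather description for a city."""
    # The type hints above become the tool's input schema, roughly
    #   {"city": {"type": "string"}, "unit": {"type": "string"}},
    # and this docstring becomes the tool description shown to the LLM.
    return f"No live weather data for {city} ({unit}) in this offline demo."

Profile entry: python_tools/weather.py:describe_weather.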

Custom tool in code (no profile entry)

def calculator(expression: str) -> str:
    """Evaluate an arithmetic expression."""
    # demo body: eval() of untrusted input is unsafe in production
    return str(eval(expression, {"__builtins__": {}}, {}))

config = AgentConfig(profile="…", tools=[calculator])

Skill execution

allow_skill_execution: true registers each script bundled in a skill (scripts/*.py, *.sh, *.js) as a separate executable Tool, named <skill_name>__<script_stem>; a hypothetical scripts/render.py inside the tabular-report skill would register as tabular-report__render. Execution is subprocess-based via SkillContainer, with the inherited dangerous-pattern guard.

prompt:

prompt:
  path: prompts/system.md         # str | null. File relative to profile dir.
  system:                         # str | null. Inline alternative to `path:`.
  extra_instructions:             # str | null. Appended after the resolved identity.

Precedence: inline system: > path: > auto-built identity block. Available placeholders inside the template (rendered via str.format): {id} {name} {age} {traits} {backstory} {initial_plan}. A broken template falls back to the auto-built identity rather than crashing the run.
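
A minimal prompts/system.md sketch using the documented placeholders:

You are {name} ({id}), age {age}.
Traits: {traits}
Backstory: {backstory}
Current plan: {initial_plan}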

Built-in tools

In addition to anything you register under tools:, the agent automatically exposes these to the LLM:

| Tool | When registered | Input schema | What it does |
| --- | --- | --- | --- |
| memory_recall | memory.is_retrieve: true | {query: string, top_k?: int (1–20, default 5)} | Semantic search over mem0 records under this agent's (user_id, agent_id, run_id) filter. Returns up to top_k records as a `- [<memory_type>] <content>` bullet list. |
| rag_search | rag.enabled: true | {query: string, top_k?: int} | Vector search over the RAG index. Returns ranked chunks above score_threshold. |
| <skill> (one per tools.skills: entry) | Always, per skill | {file?: string} | No file → returns the skill's SKILL.md body. With file → returns the named file from the skill directory. Path-escape-guarded. |
| <skill>__<script> (one per script) | allow_skill_execution: true | {args?: list[str], stdin?: string, timeout?: int} | Runs the script as a subprocess via SkillContainer. Returns stdout + stderr + exit code rendered for the LLM. |

Agent classes

| Class | Behaviour | When to use |
| --- | --- | --- |
| SimpleAgent | One LLM call per run(). No tool loop. | Chat-shaped agents, zero tool use. |
| ReActAgent | Tool-call loop. Stops when the LLM returns plain text or max_steps is hit. | Default for tool-using agents. |
| PlanAndSolveAgent | Plan → execute each step → synthesise. | Long-horizon tasks where up-front planning helps. |

All three are constructed from the same AgentConfig and share BaseAgent's helpers.

agent.run(task, max_steps=None, images=None):

  • task: str — user request.
  • max_steps: int | None — overrides cognitive.max_steps_per_cycle for this call. Ignored by SimpleAgent.
  • images: list[str | Path] | None — see Multimodal input.

Return type: AgentResult.

@dataclass
class AgentResult:
    task: str                      # the original task string
    final_answer: str              # the LLM's final plain-text answer
    steps: list[AgentStep]         # full ReAct trace; one entry per event
    usage: TokenUsage              # aggregate token counts across the run
    stopped_reason: Literal["answered", "max_steps"] = "answered"

@dataclass
class AgentStep:
    index: int
    kind: Literal["plan", "tool_call", "tool_result", "answer"]
    content: str = ""              # for "answer" / "tool_call" steps: the LLM's text
    tool_calls: list[ToolCall] = ...    # for "tool_call": the requested calls
    tool_results: list[Message] = ...   # for "tool_result": one role='tool' Message per call
    usage: TokenUsage | None = None     # per-LLM-call token counts (None for tool_result steps)
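
For example, printing the trace after a run:

result = await agent.run("What is 2 + 2?")
print(result.final_answer, result.stopped_reason)
for step in result.steps:
    print(step.index, step.kind, step.content[:80])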

Multimodal input

All three agents accept an optional images= argument on run():

from pathlib import Path

result = await agent.run(
    "What's in this image, and how does it compare to this URL?",
    images=[
        Path("./screenshot.png"),
        "https://example.com/photo.jpg",
    ],
)

When images is provided, the user turn is sent as an OpenAI content-block list:

[{"type": "text", "text": "<task>"},
 {"type": "image_url", "image_url": {"url": "<resolved-url-1>"}},
 {"type": "image_url", "image_url": {"url": "<resolved-url-2>"}}]

Each image entry can be:

| Input | Behaviour |
| --- | --- |
| Path or local file-path string | Read, base64-encoded, emitted as data:<mime>;base64,…. MIME inferred from the extension; defaults to image/png. |
| http:// or https:// URL string | Passed through unchanged. |
| data: URL string | Passed through unchanged. |

Provider compatibility:

  • OpenAI-compatible adapters (Qwen via DashScope, DeepSeek-VL, GLM, Kimi, vLLM serving multimodal models, OpenAI itself) consume the list-shape directly. Set llm.model: to a vision-capable model.
  • Anthropic adapter raises LLMAdapterError with an explicit message if list content arrives. The Message type already supports list content, so adding Claude vision later is a localised adapter change.

For ReActAgent, only the initial user turn carries images — subsequent tool-result messages stay text. For PlanAndSolveAgent, the Phase 1 plan message and every Phase 2 execute-step message carry the same images, so each phase can re-inspect the visual content.

Architecture

AgentConfig ── profile.yaml + .env
     │
     ▼
build_components_sync ── LLM, Memory, ToolRegistry, Reflector, Compressor, Logger
     │
     ▼
BaseAgent ◀──── ReActAgent | SimpleAgent | PlanAndSolveAgent
     │
     ▼
run(task) ──► AgentResult { final_answer, steps[], usage }

build_components_sync runs synchronously. MCP server connections and the optional RAG index are built lazily on the first run() call (they are async).

Module layout

| Path | Contents |
| --- | --- |
| DefenseAgent/config/profile.py | AgentProfile, LLMConfig, MemoryConfig, RAGConfig, ToolsConfig, MCPServerConfig, PromptConfig |
| DefenseAgent/llm/ | LLM facade, OpenAI-compatible + Anthropic adapters |
| DefenseAgent/memory/ | mem0 memory + ContextCompressor |
| DefenseAgent/tools/ | ToolRegistry, MCPClient |
| DefenseAgent/skills/ | SkillLoader, SkillContainer, to_tools() adapter |
| DefenseAgent/rag/ | LlamaIndexRAG, profile bridge |
| DefenseAgent/reflection/ | Reflector |
| DefenseAgent/agent/ | BaseAgent, SimpleAgent, ReActAgent, PlanAndSolveAgent, AgentConfig, _builder |

The memory, MCP, skill and RAG components are subclasses of ms-agent's upstream classes.

Demos

python scripts/react_tools_memory_demo.py     # ReAct + calculator + Tavily + memory recall
python scripts/profile_chat_demo.py           # one-turn chat with the example profile
python scripts/tools_demo.py                  # walk the skill tool layers
python scripts/memory_demo.py                 # mem0 add / search / dump

Tests

pytest                       # full suite, offline
pytest -k tools              # one module
pytest -x --tb=short         # stop on first failure

531 tests, 3 skipped.

License

MIT.
