One More Dimension — lightweight LLM agent framework with tool-calling, skills, and file tools
OMD — One More Dimension
A lightweight Python framework for building LLM agents with tool-calling support. OMD provides a unified OpenAI-compatible client, a ReAct-style agentic loop, and automatic JSON schema generation from plain Python functions.
Features
- Unified LLM client: BaseClient abstraction for any OpenAI-compatible HTTP API; swap providers by implementing a single method.
- ReAct agent: BaseAgent runs an iterative tool-calling loop with configurable iteration limits and automatic final-answer forcing.
- Tool system: Tool.from_callable() converts a typed Python function (with Sphinx-style docstring) into an OpenAI tool definition automatically.
- Skills: drop a SKILL.md into a directory and the agent can pick it up automatically (LLM router) or load it on demand via the load_skill tool.
- File tools: line-precise read_fragment / read_lines / insert_lines / delete_lines with optional workspace-root sandbox.
- Machinery integrations (optional): ready-made functions for Stability AI image generation, Gemini reference-guided images, and Veo video generation.
- Pydantic models: Model and Tool are Pydantic v2 models with strict validation and clean serialization.
Installation
From PyPI (once published):
pip install omd
# or
uv add omd
From source:
git clone https://github.com/mrYush/omd.git
cd omd
uv pip install -e .
Quick Start
1. Call an LLM
from omd.clients import ApiBarClient, Model
client = ApiBarClient(
    model=Model(
        name="gpt-4",
        url="https://api.openai.com/v1/chat/completions",
        api_token="sk-...",
    )
)
# or, if MODEL_NAME / MODEL_URL / MODEL_API_TOKEN are set in the environment:
client = ApiBarClient(model=Model.from_env())
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
response = client.call_model(messages=messages)
2. Define Tools from Functions
Tool.from_callable() extracts the function name, docstring, and type hints
to build an OpenAI-compatible JSON schema:
from omd.tools import Tool
def search(query: str, top_k: int = 5) -> list[str]:
    """Search the web for relevant information.

    :param query: Search query.
    :param top_k: Number of results to return.
    :returns: List of search results.
    """
    return [f"result for {query}"]
tool = Tool.from_callable(search)
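To make the schema generation described above concrete, here is a minimal standalone sketch of how a typed function with a Sphinx-style docstring could be turned into an OpenAI-style tool definition. It does not use OMD and is not the library's implementation: sketch_tool_schema and the _JSON_TYPES mapping are hypothetical names introduced for illustration, and only a few scalar types are handled.

```python
import inspect
import re
from typing import Any, Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_tool_schema(func: Callable[..., Any]) -> dict[str, Any]:
    """Build an OpenAI-style tool definition from a typed function (illustrative)."""
    doc = inspect.getdoc(func) or ""
    # The summary is the text before the first Sphinx field (:param / :returns).
    summary = re.split(r":(?:param|returns?)\b", doc, maxsplit=1)[0].strip()
    # Collect ":param name: text" entries, one per line.
    param_docs = dict(re.findall(r":param (\w+):\s*(.+)", doc))
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        prop = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if name in param_docs:
            prop["description"] = param_docs[name].strip()
        properties[name] = prop
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": summary,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def search(query: str, top_k: int = 5) -> list[str]:
    """Search the web for relevant information.

    :param query: Search query.
    :param top_k: Number of results to return.
    :returns: List of search results.
    """
    return [f"result for {query}"]


schema = sketch_tool_schema(search)
```

In this sketch top_k is left out of required because it has a default value; whether Tool.from_callable() applies the same rule is an assumption.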
3. Run an Agent
BaseAgent implements a ReAct-style agentic loop — it sends messages to the
LLM, executes any requested tool calls, appends the results, and repeats until
the model produces a final text answer or the iteration limit is reached.
from omd.agents import BaseAgent
from omd.clients import ApiBarClient, Model
from omd.tools import Tool
def search(query: str) -> str:
    """Search the web for information.

    :param query: Search query.
    :returns: Search results.
    """
    return f"Results for: {query}"
def create_image(prompt: str) -> str:
    """Generate an image from a text description.

    :param prompt: Image description.
    :returns: Path to generated image.
    """
    return f"/images/{prompt.replace(' ', '_')}.png"
client = ApiBarClient(
    model=Model(
        name="gpt-4",
        url="https://api.openai.com/v1/chat/completions",
        api_token="sk-...",
    )
)
agent = BaseAgent(
    client=client,
    tools=[
        Tool.from_callable(search),
        Tool.from_callable(create_image),
    ],
    tool_choice="auto",
)
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Search for 'Python async' and create an image of a snake"},
]
new_messages = agent.run(messages=messages, attempts_limit=10)
4. Manual Tool-Calling
If you need lower-level control, pass raw tool definitions directly to the client:
tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the web.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]
response = client.call_model(
    messages=messages,
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
)
Agent Architecture
BaseAgent implements the ReAct (Reasoning + Acting) pattern:
┌─────────────────────────────────────────────────────────────┐
│ BaseAgent.run() │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ 1. Send messages + tools → LLM │
└─────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────┐
│ LLM response │
└─────────────────┘
│
┌───────────────┴───────────────┐
│ │
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ Has tool_calls │ │ Final answer │
└──────────────────┘ └──────────────────┘
│ │
▼ ▼
┌──────────────────────────┐ ┌──────────────────┐
│ 2. Execute each tool │ │ Return new │
│ tool.invoke(args) │ │ messages │
└──────────────────────────┘ └──────────────────┘
│
▼
┌──────────────────────────┐
│ 3. Append results │
│ to messages │
└──────────────────────────┘
│
▼
┌──────────────────────────┐
│ 4. Loop back to step 1 │
└──────────────────────────┘
| Component | Role |
|---|---|
| BaseClient | Abstract HTTP call to an LLM API |
| Tool | Python function wrapper with JSON schema |
| BaseAgent | Agentic loop orchestrator |
| messages | Conversation history (mutated in-place) |
The attempts_limit parameter controls the maximum number of iterations.
On the final iteration tools are disabled (tools=None) to force the LLM
to produce a text answer and prevent infinite tool-call loops.
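The loop and the final-answer forcing described above can be sketched without any network access. This is an illustration under stated assumptions, not BaseAgent's actual code: sketch_react_loop and fake_model are hypothetical, the message shape follows the OpenAI chat format (tool_calls with JSON-encoded arguments), and tools are plain callables keyed by name.

```python
import json
from typing import Any, Callable

Message = dict[str, Any]


def sketch_react_loop(
    call_model: Callable[..., Message],
    tools: dict[str, Callable[..., Any]],
    messages: list[Message],
    attempts_limit: int = 10,
) -> list[Message]:
    """Run a tool-calling loop and return only the messages added during the run."""
    new_messages: list[Message] = []
    for attempt in range(attempts_limit):
        is_last = attempt == attempts_limit - 1
        # On the last iteration no tools are offered, so the model must answer in text.
        reply = call_model(messages=messages, tools=None if is_last else list(tools))
        messages.append(reply)  # the history is mutated in place
        new_messages.append(reply)
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            break  # final text answer
        for call in tool_calls:
            fn = call["function"]
            result = tools[fn["name"]](**json.loads(fn["arguments"]))
            tool_msg = {"role": "tool", "tool_call_id": call["id"], "content": str(result)}
            messages.append(tool_msg)
            new_messages.append(tool_msg)
    return new_messages


def fake_model(messages: list[Message], tools: Any = None) -> Message:
    """Stand-in for an LLM: requests a tool for as long as tools are offered."""
    if tools:
        call = {
            "id": "call_1",
            "type": "function",
            "function": {"name": "search", "arguments": json.dumps({"query": "x"})},
        }
        return {"role": "assistant", "content": None, "tool_calls": [call]}
    return {"role": "assistant", "content": "final answer"}


history = [{"role": "user", "content": "hi"}]
added = sketch_react_loop(
    fake_model, {"search": lambda query: f"Results for: {query}"}, history, attempts_limit=3
)
```

With attempts_limit=3 the stand-in model calls the tool twice and is then forced to answer, which is the behaviour the tools=None rule is meant to guarantee.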
Skills
A skill is a folder containing a SKILL.md file with YAML-like
front-matter and a free-form Markdown body. The body explains how a particular
role uses the available tools — it is appended to the system prompt only when
the skill is selected.
SKILL.md format
---
name: file-editor
description: Edit text files by line range. Use when the user asks to read fragments or modify specific lines.
recommended_tools: [read_fragment, read_lines, insert_lines, delete_lines]
version: 0.1.0
---
# File Editor
Workflow:
1. Read the target range first.
2. Insert / delete with line-precise tools.
3. Re-read to verify.
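A file in the format above could be split into metadata and body along these lines. This is a minimal sketch, not the parser shipped in omd/skills/parser.py: sketch_parse_skill is a hypothetical name, and it only understands flat key: value lines and inline [a, b] lists.

```python
def sketch_parse_skill(text: str) -> tuple[dict[str, object], str]:
    """Split SKILL.md text into (metadata, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    # Find the line that closes the front-matter block.
    end = next((i for i in range(1, len(lines)) if lines[i].strip() == "---"), None)
    if end is None:
        raise ValueError("missing closing '---'")
    meta: dict[str, object] = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


sample = """---
name: file-editor
description: Edit text files by line range.
recommended_tools: [read_fragment, read_lines, insert_lines, delete_lines]
version: 0.1.0
---
# File Editor
Workflow:
1. Read the target range first.
"""
meta, body = sketch_parse_skill(sample)
```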
Discovery
SkillRegistry discovers skills from three sources:
- Built-in skills shipped with the package (omd/skills/builtin/): file-editor, code-reviewer, test-writer. Disable via include_builtin=False.
- Directories listed in the OMD_SKILLS_PATH environment variable (os.pathsep-separated). Disable via read_env=False.
- Directories passed to the dirs= constructor argument (highest precedence).
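One way to picture the precedence rule is as a merge in which later sources override earlier ones by skill name. The sketch below is an assumption about how such a merge might look, not SkillRegistry's code: both helper names are hypothetical, and the relative order of built-in versus environment skills is a guess (the text above only states that dirs= ranks highest).

```python
import os
from typing import Optional


def split_skills_path(value: Optional[str]) -> list[str]:
    """Split an OMD_SKILLS_PATH-style value on os.pathsep, dropping empty entries."""
    if not value:
        return []
    return [part for part in value.split(os.pathsep) if part]


def merge_skill_sources(*sources: dict[str, str]) -> dict[str, str]:
    """Merge name -> location maps; later sources win on name clashes."""
    merged: dict[str, str] = {}
    for source in sources:
        merged.update(source)
    return merged


env_dirs = split_skills_path(os.pathsep.join(["/opt/skills", "", "/home/me/skills"]))
skills = merge_skill_sources(
    {"file-editor": "builtin", "code-reviewer": "builtin"},  # built-in (lowest, assumed)
    {"file-editor": "env"},                                   # from OMD_SKILLS_PATH
    {"file-editor": "dirs"},                                  # dirs= argument (highest)
)
```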
Hybrid selection
The agent can use skills in two complementary ways:
- Auto-router: a small LLM call before the main loop picks the most relevant skills based on the user's request and name + description metadata; their bodies are prepended to the system prompt.
- On-demand tools: list_skills and load_skill tools let the agent itself fetch a skill body mid-conversation when the router missed.
from omd.agents import BaseAgent
from omd.clients import ApiBarClient, Model
from omd.skills import SkillRegistry, SkillRouter
from omd.tools import make_file_tools
client = ApiBarClient(model=Model.from_env())
registry = SkillRegistry(dirs=["./my-skills"])
agent = BaseAgent(
    client=client,
    tools=make_file_tools(workspace_root="./project"),
    skills=registry,
    skill_router=SkillRouter(client, max_skills=2),
    auto_skill_routing=True,   # default: pre-select skills via router
    expose_skill_tools=True,   # default: also expose list_skills/load_skill
)
agent.run(
    messages=[{"role": "user", "content": "Refactor utils.py and add tests"}],
    attempts_limit=10,
)
File tools
omd.tools.make_file_tools() returns a ready-to-use set of line-precise
file manipulation tools:
| Tool | Purpose |
|---|---|
| read_fragment(path, offset=1, limit=200) | Read up to limit lines starting at offset |
| read_lines(path, start, end) | Read the inclusive 1-indexed line range |
| insert_lines(path, line, content) | Insert content before the given line |
| delete_lines(path, start, end) | Delete the inclusive 1-indexed range |
All line numbers are 1-indexed, and the responses include LINE|content prefixes so
the model never has to count lines manually. Pass
workspace_root=Path("./project") to refuse any path that resolves outside
that directory; omit it for an unrestricted toolset.
from omd.tools import make_file_tools
tools = make_file_tools(workspace_root="./project")
The standalone callables (omd.tools.read_fragment, read_lines,
insert_lines, delete_lines) are also exported for direct use without an
agent.
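The two ideas behind these tools, the workspace-root check and the LINE|content output, can be sketched in a few lines of standard-library Python. This is not the code in omd/tools/file_tools.py: resolve_in_workspace and sketch_read_lines are hypothetical names, relative paths are assumed to be resolved against the workspace root, and the exact prefix format (2|beta) is an interpretation of the description above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_in_workspace(path: str, workspace_root: Optional[Path]) -> Path:
    """Resolve path; with a workspace root, reject anything that lands outside it."""
    if workspace_root is None:
        return Path(path).resolve()  # unrestricted toolset
    root = workspace_root.resolve()
    # Joining with an absolute path discards root, so absolute paths are checked too.
    resolved = (root / path).resolve()
    if resolved != root and root not in resolved.parents:
        raise PermissionError(f"{path!r} resolves outside {root}")
    return resolved


def sketch_read_lines(
    path: str, start: int, end: int, workspace_root: Optional[Path] = None
) -> str:
    """Return the inclusive 1-indexed range start..end with 'LINE|content' prefixes."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    target = resolve_in_workspace(path, workspace_root)
    lines = target.read_text(encoding="utf-8").splitlines()
    selected = lines[start - 1:end]
    return "\n".join(f"{n}|{text}" for n, text in enumerate(selected, start=start))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("alpha\nbeta\ngamma\n", encoding="utf-8")
    excerpt = sketch_read_lines("notes.txt", 2, 3, workspace_root=root)
    try:
        sketch_read_lines("../outside.txt", 1, 1, workspace_root=root)
        blocked = False
    except PermissionError:
        blocked = True
```

Resolving before comparing means that ../ segments and symlinks are normalised first, so a path cannot escape the sandbox by spelling.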
Machinery (Optional Integrations)
The omd.machinery subpackage provides ready-made functions for external
media generation services. These are optional — the core agent/client/tool
system works without them.
Stability AI — Image Generation
from omd.machinery import generate_image
url = generate_image(
    prompt="A serene mountain landscape at sunrise",
    aspect_ratio="16:9",
    output_format="png",
)
Image-to-image with strength control:
url = generate_image(
    prompt="Transform into a watercolor painting",
    image_path="/path/to/input.png",
    strength=0.5,
)
Stability AI — Background Replacement
from omd.machinery import replace_background
url = replace_background(
    subject_image_path="/path/to/person.png",
    background_prompt="modern office with large windows and city view",
    light_source_direction="right",
)
url = replace_background(
    subject_image_path="/path/to/person.png",
    background_reference_path="/path/to/beach_bg.jpg",
    preserve_original_subject=0.9,
)
Gemini — Reference-Guided Image Generation
from omd.machinery import generate_image_gemini
url = generate_image_gemini(
    prompt="A portrait in the style of the reference",
    style_reference_image="/path/to/reference.png",
)
Veo — Video Generation
generate_video_veo produces short videos (4-8 seconds) with native audio via
the Google Veo API. It supports both text-to-video and image-to-video.
Under the hood the function:
- Submits a predictLongRunning request to the Gemini API (via api-bar)
- Polls the operation every 10 seconds until completion
- Downloads the generated MP4 video
- Uploads it to MinIO and returns a presigned URL
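The polling step in the list above follows a common pattern: fetch the operation, stop when it reports completion, otherwise wait and retry until a deadline. The sketch below shows that pattern only. It is not the code in omd.machinery.veo: sketch_poll_operation is a hypothetical name, the "done" flag is assumed from the usual long-running-operation shape, and sleep/clock are injectable purely so the example runs instantly.

```python
import time
from typing import Any, Callable


def sketch_poll_operation(
    fetch: Callable[[], dict[str, Any]],
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Call fetch() every interval_s seconds until it reports done or time runs out."""
    deadline = clock() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            return operation
        # Give up if another wait would take us past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"operation not finished after {timeout_s} s")
        sleep(interval_s)


# Simulated operation that completes on the third poll; sleep is replaced by a recorder.
states = iter([{"done": False}, {"done": False}, {"done": True, "response": "ok"}])
waits: list[float] = []
result = sketch_poll_operation(lambda: next(states), sleep=waits.append)
```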
from omd.machinery import generate_video_veo
# Text-to-video
url = generate_video_veo(
    prompt="A drone shot of a sunset over the ocean, cinematic, warm tones",
)
# Portrait video, Full HD
url = generate_video_veo(
    prompt='Close-up of a barista pouring latte art. She says "Almost perfect".',
    aspect_ratio="9:16",
    resolution="1080p",
    duration_seconds="8",
)
# Image-to-video (first frame from an image)
url = generate_video_veo(
    prompt="The cat slowly opens its eyes and stretches",
    reference_image="https://example.com/sleeping_cat.png",
)
Veo Model Selection
| Model | When to use |
|---|---|
| veo-3.0-fast-generate-001 | Default. Veo 3.0, cost-effective |
| veo-3.0-generate-001 | Veo 3.0, baseline quality |
| veo-3.1-generate-preview | Preview 3.1: highest quality |
| veo-3.1-fast-generate-preview | Preview 3.1: faster than full 3.1 |
| veo-3.1-lite-generate-preview | Preview 3.1: lightweight/cheapest in the 3.1 line |
Override per call with model="..." or change the default via the
_VEO_DEFAULT_MODEL constant in omd.machinery.veo.
Veo Parameters
| Parameter | Default | Description |
|---|---|---|
| prompt | (required) | Text description. Use quotes for dialogue, explicit words for sounds |
| reference_image | None | Starting frame: file path, URL, or data URL |
| aspect_ratio | "16:9" | "16:9" (landscape) or "9:16" (portrait) |
| resolution | "720p" | "720p", "1080p", or "4k" |
| duration_seconds | "8" | "4", "6", or "8" (1080p/4k require "8") |
| model | "veo-3.0-fast-generate-001" | See model table above |
| timeout_s | 600.0 | Max wait time for generation (seconds) |
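The constraints in the table above are easy to check before submitting a request. The helper below is a hypothetical pre-flight check written for this README, not a function exported by omd.machinery; whether generate_video_veo performs the same validation itself is not stated here.

```python
def sketch_validate_veo_args(
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    duration_seconds: str = "8",
) -> None:
    """Raise ValueError if the arguments contradict the parameter table."""
    if aspect_ratio not in {"16:9", "9:16"}:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in {"720p", "1080p", "4k"}:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if duration_seconds not in {"4", "6", "8"}:
        raise ValueError(f"unsupported duration_seconds: {duration_seconds!r}")
    # 1080p and 4k are only listed as valid together with an 8-second duration.
    if resolution in {"1080p", "4k"} and duration_seconds != "8":
        raise ValueError(f'{resolution} requires duration_seconds="8"')


def is_valid(**kwargs: str) -> bool:
    """Convenience wrapper returning True/False instead of raising."""
    try:
        sketch_validate_veo_args(**kwargs)
    except ValueError:
        return False
    return True
```

For example, is_valid(resolution="1080p", duration_seconds="6") is rejected, while the same resolution with duration_seconds="8" passes.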
Prompt Tips
- Composition: subject, action, style, camera motion, ambiance
- Dialogue: A man says "Let's go!" and grabs his coat
- Sound: thunder rumbling, rain hitting the window
- Style: cinematic, anime, stop-motion, film noir
- Camera: dolly shot, aerial view, close-up, POV shot
Configuration
Environment Variables
| Variable | Description | Required |
|---|---|---|
| MODEL_URL | Chat completions endpoint URL | Yes (for client) |
| MODEL_NAME | Model name (e.g. gpt-4, llama-3) | Yes (for client) |
| MODEL_API_TOKEN | API key for authorization | Yes (for client) |
| STABILITY_API_KEY | Stability AI API key | For omd.machinery.sd |
| STABILITY_BASE_URL | Stability AI base URL | For omd.machinery.sd |
| API_BAR_TOKEN | Api-bar gateway token | For omd.machinery.gemini / veo |
| MINIO_* | MinIO connection settings | For media upload (image_utils, veo) |
| MINIO_PRESIGNED_URL_EXPIRES_IN | Presigned URL TTL in seconds (default: 3600) | No |
Project Structure
src/omd/
├── __init__.py
├── agents/
│ ├── __init__.py
│ └── base.py # BaseAgent — ReAct-style agentic loop with skills
├── clients/
│ ├── __init__.py
│ ├── base.py # BaseClient — abstract interface
│ ├── apibar.py # ApiBarClient — OpenAI-compatible implementation
│ └── data_models.py # Model — Pydantic client config (with from_env)
├── tools/
│ ├── __init__.py
│ ├── data_models.py # Tool — auto-generated JSON schemas from functions
│ └── file_tools.py # read_fragment / read_lines / insert / delete
├── skills/
│ ├── __init__.py
│ ├── data_models.py # Skill — Pydantic skill record
│ ├── parser.py # SKILL.md front-matter parser
│ ├── registry.py # SkillRegistry — discovery and lookup
│ ├── router.py # SkillRouter — LLM-driven selection
│ ├── tools.py # list_skills / load_skill agent tools
│ └── builtin/ # file-editor, code-reviewer, test-writer
├── machinery/ # Optional: Stability AI / Gemini / Veo
├── prompts/
│ └── base.py # PromptBase, ListPrompt — system prompt builders
└── utils/
└── logging_utils.py # truncate_text, serialize_for_log
Model.from_env() reads MODEL_NAME, MODEL_URL, and MODEL_API_TOKEN
lazily — import omd no longer requires any environment variables.
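Reading the variables lazily means they are looked up when from_env() is called rather than at import time. The sketch below illustrates that idea with a plain dataclass standing in for the Pydantic Model; SketchModel and its environ parameter are hypothetical and exist only so the example can run without touching the real environment.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SketchModel:
    """Stand-in for omd.clients.Model (the real class is a Pydantic v2 model)."""

    name: str
    url: str
    api_token: str

    @classmethod
    def from_env(cls, environ: Optional[Mapping[str, str]] = None) -> "SketchModel":
        # The environment is consulted only here, so importing this module never fails.
        env = os.environ if environ is None else environ
        keys = ("MODEL_NAME", "MODEL_URL", "MODEL_API_TOKEN")
        missing = [key for key in keys if not env.get(key)]
        if missing:
            raise KeyError(f"missing environment variables: {', '.join(missing)}")
        return cls(
            name=env["MODEL_NAME"],
            url=env["MODEL_URL"],
            api_token=env["MODEL_API_TOKEN"],
        )


model = SketchModel.from_env(
    {
        "MODEL_NAME": "gpt-4",
        "MODEL_URL": "https://api.openai.com/v1/chat/completions",
        "MODEL_API_TOKEN": "sk-test",
    }
)
```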
Development
Setup
git clone https://github.com/mrYush/omd.git
cd omd
uv sync --all-extras
Running Tests
uv run pytest
Linting
uv run ruff check src/ tests/
uv run ruff format --check src/ tests/
Roadmap
- Payload normalization across providers (max_tokens vs max_completion_tokens, etc.)
- Legacy tool-calling schema support (functions / function_call)
- Retry logic with exponential backoff
- Streaming responses
- Additional client implementations (Anthropic, Google AI, etc.)
- Richer skill metadata (capabilities, dependencies, examples) and a CLI to scaffold/validate skill packages
License
MIT — Copyright (c) 2026 mrYush