One More Dimension — lightweight LLM agent framework with tool-calling, skills, and file tools

OMD — One More Dimension

A lightweight Python framework for building LLM agents with tool-calling support. OMD provides a unified OpenAI-compatible client, a ReAct-style agentic loop, and automatic JSON schema generation from plain Python functions.

Features

Unified LLM client — BaseClient abstraction for any OpenAI-compatible HTTP API; swap providers by implementing a single method.
ReAct agent — BaseAgent runs an iterative tool-calling loop with configurable iteration limits and automatic final-answer forcing.
Tool system — Tool.from_callable() converts a typed Python function (with Sphinx-style docstring) into an OpenAI tool definition automatically.
Skills — drop a SKILL.md into a directory and the agent can pick it up automatically (LLM router) or load it on demand via the load_skill tool.
File tools — line-precise read_fragment / read_lines / insert_lines / delete_lines with optional workspace-root sandbox.
Machinery integrations (optional) — ready-made functions for Stability AI image generation, Gemini reference-guided images, and Veo video generation.
Pydantic models — Model and Tool are Pydantic v2 models with strict validation and clean serialization.

Installation

From PyPI:

pip install omd
# or
uv add omd

From source:

git clone https://github.com/mrYush/omd.git
cd omd
uv pip install -e .

Quick Start

1. Call an LLM

from omd.clients import ApiBarClient, Model

client = ApiBarClient(
    model=Model(
        name="gpt-4",
        url="https://api.openai.com/v1/chat/completions",
        api_token="sk-...",
    )
)

# or, if MODEL_NAME / MODEL_URL / MODEL_API_TOKEN are set in the environment:
client = ApiBarClient(model=Model.from_env())

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]

response = client.call_model(messages=messages)

2. Define Tools from Functions

Tool.from_callable() extracts the function name, docstring, and type hints to build an OpenAI-compatible JSON schema:

from omd.tools import Tool


def search(query: str, top_k: int = 5) -> list[str]:
    """Search the web for relevant information.

    :param query: Search query.
    :param top_k: Number of results to return.
    :returns: List of search results.
    """
    return [f"result for {query}"]


tool = Tool.from_callable(search)
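
To make the schema-generation idea concrete, here is a minimal standalone sketch of how a function's name, Sphinx-style docstring, and type hints could be turned into an OpenAI-style tool definition. It is not OMD's implementation: `schema_from_callable` and the `_JSON_TYPES` mapping are hypothetical names introduced for illustration, and only a handful of scalar types are handled.

```python
import inspect
import re
from typing import Any, Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def schema_from_callable(func: Callable[..., Any]) -> dict[str, Any]:
    """Build an OpenAI-style tool definition from a typed function (illustrative)."""
    doc = inspect.getdoc(func) or ""
    # The first docstring line serves as the tool description.
    description = doc.splitlines()[0] if doc else ""
    # Collect ":param name: text" entries from the Sphinx-style docstring.
    param_docs = dict(re.findall(r":param (\w+):\s*(.+)", doc))
    hints = get_type_hints(func)

    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        prop = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if name in param_docs:
            prop["description"] = param_docs[name].strip()
        properties[name] = prop
        # Parameters without a default value are required.
        if param.default is inspect.Parameter.empty:
            required.append(name)

    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def search(query: str, top_k: int = 5) -> list[str]:
    """Search the web for relevant information.

    :param query: Search query.
    :param top_k: Number of results to return.
    :returns: List of search results.
    """
    return [f"result for {query}"]


schema = schema_from_callable(search)
```

In this sketch `query` ends up in `required` because it has no default, while `top_k` is optional and typed as `integer`.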

3. Run an Agent

BaseAgent implements a ReAct-style agentic loop — it sends messages to the LLM, executes any requested tool calls, appends the results, and repeats until the model produces a final text answer or the iteration limit is reached.

from omd.agents import BaseAgent
from omd.clients import ApiBarClient, Model
from omd.tools import Tool


def search(query: str) -> str:
    """Search the web for information.

    :param query: Search query.
    :returns: Search results.
    """
    return f"Results for: {query}"


def create_image(prompt: str) -> str:
    """Generate an image from a text description.

    :param prompt: Image description.
    :returns: Path to generated image.
    """
    return f"/images/{prompt.replace(' ', '_')}.png"


client = ApiBarClient(
    model=Model(
        name="gpt-4",
        url="https://api.openai.com/v1/chat/completions",
        api_token="sk-...",
    )
)

agent = BaseAgent(
    client=client,
    tools=[
        Tool.from_callable(search),
        Tool.from_callable(create_image),
    ],
    tool_choice="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Search for 'Python async' and create an image of a snake"},
]

new_messages = agent.run(messages=messages, attempts_limit=10)

4. Manual Tool-Calling

If you need lower-level control, pass raw tool definitions directly to the client:

tools = [
    {
        "type": "function",
        "function": {
            "name": "search",
            "description": "Search the web.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    },
]

response = client.call_model(
    messages=messages,
    tools=tools,
    tool_choice="auto",
    parallel_tool_calls=True,
)
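
With manual tool-calling, executing the requested tools is up to the caller. The return type of call_model is not described here, so the sketch below assumes the standard OpenAI chat-completions message shape, in which an assistant message carries a `tool_calls` list and each call's arguments arrive as a JSON string. `dispatch_tool_calls` and the hand-written `assistant_message` are hypothetical and only illustrate the pattern.

```python
import json
from typing import Any, Callable


def dispatch_tool_calls(
    assistant_message: dict[str, Any],
    registry: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Run each requested tool and return OpenAI-style "tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        # In the OpenAI format, arguments are a JSON-encoded string.
        args = json.loads(call["function"]["arguments"] or "{}")
        try:
            output = registry[name](**args)
        except Exception as exc:  # report failures back to the model
            output = f"error: {exc!r}"
        results.append(
            {"role": "tool", "tool_call_id": call["id"], "content": str(output)}
        )
    return results


# Hand-written example of an assistant message requesting one tool call.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "search", "arguments": '{"query": "Python async"}'},
        }
    ],
}

tool_messages = dispatch_tool_calls(
    assistant_message, {"search": lambda query: f"Results for: {query}"}
)
```

The resulting tool messages would then be appended to the conversation before the next call to the model.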

Agent Architecture

BaseAgent implements the ReAct (Reasoning + Acting) pattern:

┌─────────────────────────────────────────────────────────────┐
│                       BaseAgent.run()                       │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  1. Send messages + tools → LLM                             │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │  LLM response   │
                    └─────────────────┘
                              │
              ┌───────────────┴───────────────┐
              │                               │
              ▼                               ▼
    ┌──────────────────┐           ┌──────────────────┐
    │  Has tool_calls  │           │  Final answer    │
    └──────────────────┘           └──────────────────┘
              │                               │
              ▼                               ▼
┌──────────────────────────┐       ┌──────────────────┐
│ 2. Execute each tool     │       │ Return new       │
│    tool.invoke(args)     │       │ messages         │
└──────────────────────────┘       └──────────────────┘
              │
              ▼
┌──────────────────────────┐
│ 3. Append results        │
│    to messages           │
└──────────────────────────┘
              │
              ▼
┌──────────────────────────┐
│ 4. Loop back to step 1   │
└──────────────────────────┘

Component	Role
`BaseClient`	Abstract HTTP call to an LLM API
`Tool`	Python function wrapper with JSON schema
`BaseAgent`	Agentic loop orchestrator
`messages`	Conversation history (mutated in-place)

The attempts_limit parameter controls the maximum number of iterations. On the final iteration tools are disabled (tools=None) to force the LLM to produce a text answer and prevent infinite tool-call loops.
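
The loop and the final-iteration rule can be sketched without any network access. The code below is a simplified model of the behavior described above, not BaseAgent itself: `run_loop`, `stub_llm`, and the `(messages, offered_tools)` client signature are assumptions made for illustration.

```python
import json
from typing import Any, Callable, Optional

Message = dict[str, Any]
# Hypothetical client signature: (messages, offered tool names) -> assistant message.
LLM = Callable[[list[Message], Optional[list[str]]], Message]


def run_loop(
    llm: LLM,
    tools: dict[str, Callable[..., Any]],
    messages: list[Message],
    attempts_limit: int = 10,
) -> list[Message]:
    """Alternate LLM calls and tool execution until a text answer appears."""
    new_messages: list[Message] = []
    for attempt in range(1, attempts_limit + 1):
        # On the final iteration no tools are offered, forcing a text answer.
        offered = list(tools) if attempt < attempts_limit else None
        reply = llm(messages, offered)
        messages.append(reply)  # the history is mutated in place
        new_messages.append(reply)
        calls = reply.get("tool_calls") or []
        if not calls:
            break  # plain text reply: this is the final answer
        for call in calls:
            args = json.loads(call["function"]["arguments"])
            result = tools[call["function"]["name"]](**args)
            tool_msg = {"role": "tool", "tool_call_id": call["id"], "content": str(result)}
            messages.append(tool_msg)
            new_messages.append(tool_msg)
    return new_messages


def stub_llm(messages: list[Message], offered: Optional[list[str]]) -> Message:
    """Stand-in model that requests `search` for as long as tools are offered."""
    if offered:
        call = {
            "id": f"call_{len(messages)}",
            "type": "function",
            "function": {"name": "search", "arguments": '{"query": "omd"}'},
        }
        return {"role": "assistant", "content": None, "tool_calls": [call]}
    return {"role": "assistant", "content": "Done."}


history = [{"role": "user", "content": "Find omd"}]
produced = run_loop(
    stub_llm,
    {"search": lambda query: f"Results for: {query}"},
    history,
    attempts_limit=3,
)
```

Because the stub never stops asking for tools on its own, only the tool-free third iteration ends the run, which is the situation the iteration limit is meant to handle.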

Skills

A skill is a folder containing a SKILL.md file with YAML-like front-matter and a free-form Markdown body. The body explains how a particular role uses the available tools — it is appended to the system prompt only when the skill is selected.

SKILL.md format

---
name: file-editor
description: Edit text files by line range. Use when the user asks to read fragments or modify specific lines.
recommended_tools: [read_fragment, read_lines, insert_lines, delete_lines]
version: 0.1.0
---
# File Editor

Workflow:
1. Read the target range first.
2. Insert / delete with line-precise tools.
3. Re-read to verify.
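
A file in this format can be split into metadata and body with a few lines of Python. The following is a rough sketch that assumes only flat `key: value` pairs and bracketed lists appear in the front matter; `parse_skill_md` is a hypothetical helper, and the parser shipped in omd/skills/parser.py may behave differently.

```python
from typing import Any


def parse_skill_md(text: str) -> tuple[dict[str, Any], str]:
    """Split SKILL.md text into (metadata, body). Illustrative only."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter present
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text  # unterminated front matter: treat as plain body
    meta: dict[str, Any] = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        value = value.strip()
        if value.startswith("[") and value.endswith("]"):
            # Inline list such as [a, b, c].
            meta[key.strip()] = [v.strip() for v in value[1:-1].split(",") if v.strip()]
        else:
            meta[key.strip()] = value
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body


sample = """---
name: file-editor
description: Edit text files by line range.
recommended_tools: [read_fragment, read_lines, insert_lines, delete_lines]
version: 0.1.0
---
# File Editor

Workflow:
1. Read the target range first.
"""

meta, body = parse_skill_md(sample)
```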

Discovery

SkillRegistry discovers skills from three sources:

Built-in skills shipped with the package (omd/skills/builtin/): file-editor, code-reviewer, test-writer. Disable via include_builtin=False.
Directories listed in the OMD_SKILLS_PATH environment variable (os.pathsep-separated). Disable via read_env=False.
Directories passed to the dirs= constructor argument (highest precedence).
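
One way to picture the three sources is as an ordered search path in which later entries override earlier ones. The sketch below assumes the precedence follows the order listed above (built-in, then environment, then dirs=); only the top rank of dirs= is stated explicitly, so the relative order of the first two is a guess. `skill_search_path` and `merge_skills` are hypothetical names, not SkillRegistry methods.

```python
import os
from pathlib import Path
from typing import Iterable, Optional

# Location of the built-in skills as quoted in the text above.
BUILTIN_DIR = Path("omd/skills/builtin")


def skill_search_path(
    dirs: Iterable[str] = (),
    env_value: Optional[str] = None,
    include_builtin: bool = True,
    read_env: bool = True,
) -> list[Path]:
    """Return skill directories ordered from lowest to highest precedence."""
    ordered: list[Path] = []
    if include_builtin:
        ordered.append(BUILTIN_DIR)
    if read_env and env_value:
        # OMD_SKILLS_PATH holds several directories joined by os.pathsep.
        ordered.extend(Path(p) for p in env_value.split(os.pathsep) if p)
    ordered.extend(Path(d) for d in dirs)  # explicit dirs rank highest
    return ordered


def merge_skills(sources: list[dict[str, str]]) -> dict[str, str]:
    """Merge name -> origin maps; later (higher-precedence) sources override."""
    merged: dict[str, str] = {}
    for skills in sources:
        merged.update(skills)
    return merged


env_value = os.pathsep.join(["team-skills", "shared-skills"])
path = skill_search_path(dirs=["./my-skills"], env_value=env_value)
winner = merge_skills([
    {"file-editor": "builtin"},
    {"file-editor": "env"},
    {"file-editor": "dirs"},
])
```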

Hybrid selection

The agent can use skills in two complementary ways:

Auto-router — a small LLM call before the main loop picks the most relevant skills based on the user's request and name + description metadata; their bodies are prepended to the system prompt.
On-demand tools — list_skills and load_skill tools let the agent itself fetch a skill body mid-conversation when the router missed it.

from omd.agents import BaseAgent
from omd.clients import ApiBarClient, Model
from omd.skills import SkillRegistry, SkillRouter
from omd.tools import make_file_tools

client = ApiBarClient(model=Model.from_env())
registry = SkillRegistry(dirs=["./my-skills"])

agent = BaseAgent(
    client=client,
    tools=make_file_tools(workspace_root="./project"),
    skills=registry,
    skill_router=SkillRouter(client, max_skills=2),
    auto_skill_routing=True,    # default — pre-select skills via router
    expose_skill_tools=True,    # default — also expose list_skills/load_skill
)

agent.run(
    messages=[{"role": "user", "content": "Refactor utils.py and add tests"}],
    attempts_limit=10,
)

File tools

omd.tools.make_file_tools() returns a ready-to-use set of line-precise file manipulation tools:

Tool	Purpose
`read_fragment(path, offset=1, limit=200)`	Read up to `limit` lines starting at `offset`
`read_lines(path, start, end)`	Read the inclusive 1-indexed line range
`insert_lines(path, line, content)`	Insert `content` before the given line
`delete_lines(path, start, end)`	Delete the inclusive 1-indexed range

All line numbers are 1-indexed, and responses include LINE|content prefixes so the model never has to count lines manually. Pass workspace_root=Path("./project") to refuse any path that resolves outside that directory; omit it for an unrestricted toolset.

from omd.tools import make_file_tools

tools = make_file_tools(workspace_root="./project")

The standalone callables (omd.tools.read_fragment, read_lines, insert_lines, delete_lines) are also exported for direct use without an agent.
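
The two ideas behind these tools, the workspace-root check and the line-number prefixes, can be sketched with the standard library alone. The helpers below (`resolve_in_workspace`, `numbered_lines`) are hypothetical stand-ins, not the functions exported by omd.tools, and the exact prefix layout (`2|beta`) is an assumption based on the LINE|content description.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_in_workspace(path: str, workspace_root: Optional[Path]) -> Path:
    """Resolve `path`, refusing anything outside `workspace_root` when it is set."""
    if workspace_root is None:
        return Path(path).resolve()  # unrestricted mode
    root = workspace_root.resolve()
    candidate = (root / path).resolve()  # collapses ".." and symlinks
    if candidate != root and root not in candidate.parents:
        raise PermissionError(f"{path} is outside {root}")
    return candidate


def numbered_lines(
    path: str, start: int, end: int, workspace_root: Optional[Path] = None
) -> str:
    """Return the inclusive 1-indexed range [start, end] with LINE|content prefixes."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    lines = resolve_in_workspace(path, workspace_root).read_text().splitlines()
    selected = lines[start - 1:end]
    return "\n".join(f"{start + i}|{text}" for i, text in enumerate(selected))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("alpha\nbeta\ngamma\ndelta\n")
    fragment = numbered_lines("notes.txt", 2, 3, workspace_root=root)
    try:
        numbered_lines("../outside.txt", 1, 1, workspace_root=root)
        escaped = True
    except PermissionError:
        escaped = False
```

Resolving the candidate path before comparing it with the root is what stops `..` segments and symlinks from escaping the sandbox in this sketch.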

Machinery (Optional Integrations)

The omd.machinery subpackage provides ready-made functions for external media generation services. These are optional — the core agent/client/tool system works without them.

Stability AI — Image Generation

from omd.machinery import generate_image

url = generate_image(
    prompt="A serene mountain landscape at sunrise",
    aspect_ratio="16:9",
    output_format="png",
)

Image-to-image with strength control:

url = generate_image(
    prompt="Transform into a watercolor painting",
    image_path="/path/to/input.png",
    strength=0.5,
)

Stability AI — Background Replacement

from omd.machinery import replace_background

url = replace_background(
    subject_image_path="/path/to/person.png",
    background_prompt="modern office with large windows and city view",
    light_source_direction="right",
)

url = replace_background(
    subject_image_path="/path/to/person.png",
    background_reference_path="/path/to/beach_bg.jpg",
    preserve_original_subject=0.9,
)

Gemini — Reference-Guided Image Generation

from omd.machinery import generate_image_gemini

url = generate_image_gemini(
    prompt="A portrait in the style of the reference",
    style_reference_image="/path/to/reference.png",
)

Veo — Video Generation

generate_video_veo produces short videos (4 to 8 seconds) with native audio via the Google Veo API. It supports both text-to-video and image-to-video generation.

Under the hood the function:

Submits a predictLongRunning request to the Gemini API (via api-bar)
Polls the operation every 10 seconds until completion
Downloads the generated MP4 video
Uploads it to MinIO and returns a presigned URL
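
The polling step above follows a common submit-then-poll pattern. Here is a minimal sketch of that pattern, assuming the operation status is a dict with a boolean `done` field; `poll_operation` and its injectable `sleep` and `clock` parameters are hypothetical and exist so the example runs instantly without a real API.

```python
import time
from typing import Any, Callable


def poll_operation(
    fetch: Callable[[], dict[str, Any]],
    interval_s: float = 10.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], Any] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Call `fetch` until it reports done=True or `timeout_s` would be exceeded."""
    deadline = clock() + timeout_s
    while True:
        operation = fetch()
        if operation.get("done"):
            return operation
        # Give up if another wait would pass the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"operation not finished after {timeout_s} s")
        sleep(interval_s)


# Simulated operation that completes on the third status check.
states = iter([
    {"done": False},
    {"done": False},
    {"done": True, "response": {"uri": "video.mp4"}},
])
waits: list[float] = []
# Recording the requested waits instead of sleeping keeps the demo instant.
result = poll_operation(lambda: next(states), sleep=waits.append)
```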

from omd.machinery import generate_video_veo

# Text-to-video
url = generate_video_veo(
    prompt="A drone shot of a sunset over the ocean, cinematic, warm tones",
)

# Portrait video, Full HD
url = generate_video_veo(
    prompt='Close-up of a barista pouring latte art. She says "Almost perfect".',
    aspect_ratio="9:16",
    resolution="1080p",
    duration_seconds="8",
)

# Image-to-video (first frame from an image)
url = generate_video_veo(
    prompt="The cat slowly opens its eyes and stretches",
    reference_image="https://example.com/sleeping_cat.png",
)

Veo Model Selection

Model	When to use
`veo-3.0-fast-generate-001`	Default. Veo 3.0, cost-effective
`veo-3.0-generate-001`	Veo 3.0, baseline quality
`veo-3.1-generate-preview`	Preview 3.1: highest quality
`veo-3.1-fast-generate-preview`	Preview 3.1: faster than full 3.1
`veo-3.1-lite-generate-preview`	Preview 3.1: lightweight/cheapest in the 3.1 line

Override per call with model="..." or change the default via the _VEO_DEFAULT_MODEL constant in omd.machinery.veo.

Veo Parameters

Parameter	Default	Description
`prompt`	(required)	Text description. Use quotes for dialogue, explicit words for sounds
`reference_image`	`None`	Starting frame: file path, URL, or data URL
`aspect_ratio`	`"16:9"`	`"16:9"` (landscape) or `"9:16"` (portrait)
`resolution`	`"720p"`	`"720p"`, `"1080p"`, or `"4k"`
`duration_seconds`	`"8"`	`"4"`, `"6"`, or `"8"` (1080p/4k require `"8"`)
`model`	`"veo-3.0-fast-generate-001"`	See model table above
`timeout_s`	`600.0`	Max wait time for generation (seconds)
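
The constraints in this table can be checked before a request is sent. The sketch below encodes only the values listed above; `check_veo_params` is a hypothetical helper, and whether generate_video_veo performs equivalent validation itself is not stated here.

```python
ASPECT_RATIOS = {"16:9", "9:16"}
RESOLUTIONS = {"720p", "1080p", "4k"}
DURATIONS = {"4", "6", "8"}  # strings, matching the parameter table


def check_veo_params(
    aspect_ratio: str = "16:9",
    resolution: str = "720p",
    duration_seconds: str = "8",
) -> list[str]:
    """Return a list of problems; an empty list means the combination is valid."""
    problems = []
    if aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {aspect_ratio}")
    if resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if duration_seconds not in DURATIONS:
        problems.append(f"unsupported duration_seconds: {duration_seconds}")
    # Higher resolutions are only available for 8-second clips.
    if resolution in {"1080p", "4k"} and duration_seconds != "8":
        problems.append('1080p and 4k require duration_seconds="8"')
    return problems


defaults_ok = check_veo_params()
portrait_hd = check_veo_params(aspect_ratio="9:16", resolution="1080p", duration_seconds="8")
too_short = check_veo_params(resolution="1080p", duration_seconds="4")
```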

Prompt Tips

Composition: subject, action, style, camera motion, ambiance
Dialogue: A man says "Let's go!" and grabs his coat
Sound: thunder rumbling, rain hitting the window
Style: cinematic, anime, stop-motion, film noir
Camera: dolly shot, aerial view, close-up, POV shot

Configuration

Environment Variables

Variable	Description	Required
`MODEL_URL`	Chat completions endpoint URL	Yes (for client)
`MODEL_NAME`	Model name (e.g. `gpt-4`, `llama-3`)	Yes (for client)
`MODEL_API_TOKEN`	API key for authorization	Yes (for client)
`STABILITY_API_KEY`	Stability AI API key	For `omd.machinery.sd`
`STABILITY_BASE_URL`	Stability AI base URL	For `omd.machinery.sd`
`API_BAR_TOKEN`	Api-bar gateway token	For `omd.machinery.gemini` / `veo`
`MINIO_*`	MinIO connection settings	For media upload (`image_utils`, `veo`)
`MINIO_PRESIGNED_URL_EXPIRES_IN`	Presigned URL TTL in seconds (default: 3600)	No

Project Structure

src/omd/
├── __init__.py
├── agents/
│   ├── __init__.py
│   └── base.py             # BaseAgent — ReAct-style agentic loop with skills
├── clients/
│   ├── __init__.py
│   ├── base.py             # BaseClient — abstract interface
│   ├── apibar.py           # ApiBarClient — OpenAI-compatible implementation
│   └── data_models.py      # Model — Pydantic client config (with from_env)
├── tools/
│   ├── __init__.py
│   ├── data_models.py      # Tool — auto-generated JSON schemas from functions
│   └── file_tools.py       # read_fragment / read_lines / insert / delete
├── skills/
│   ├── __init__.py
│   ├── data_models.py      # Skill — Pydantic skill record
│   ├── parser.py           # SKILL.md front-matter parser
│   ├── registry.py         # SkillRegistry — discovery and lookup
│   ├── router.py           # SkillRouter — LLM-driven selection
│   ├── tools.py            # list_skills / load_skill agent tools
│   └── builtin/            # file-editor, code-reviewer, test-writer
├── machinery/              # Optional: Stability AI / Gemini / Veo
├── prompts/
│   └── base.py             # PromptBase, ListPrompt — system prompt builders
└── utils/
    └── logging_utils.py    # truncate_text, serialize_for_log

Model.from_env() reads MODEL_NAME, MODEL_URL, and MODEL_API_TOKEN lazily — import omd no longer requires any environment variables.
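
Reading the variables lazily means they are looked up when from_env() is called, not at import time. The sketch below shows that pattern with a plain dataclass; `ModelConfig` is a hypothetical stand-in for the Pydantic Model, and the error raised for missing variables is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

_REQUIRED = ("MODEL_NAME", "MODEL_URL", "MODEL_API_TOKEN")


@dataclass(frozen=True)
class ModelConfig:
    """Stand-in for the Model config; a plain dataclass, not the Pydantic class."""

    name: str
    url: str
    api_token: str

    @classmethod
    def from_env(cls, environ: Optional[Mapping[str, str]] = None) -> "ModelConfig":
        # The environment is consulted only here, never at import time.
        env = os.environ if environ is None else environ
        missing = [key for key in _REQUIRED if not env.get(key)]
        if missing:
            raise RuntimeError(f"missing environment variables: {', '.join(missing)}")
        return cls(
            name=env["MODEL_NAME"],
            url=env["MODEL_URL"],
            api_token=env["MODEL_API_TOKEN"],
        )


config = ModelConfig.from_env({
    "MODEL_NAME": "gpt-4",
    "MODEL_URL": "https://api.openai.com/v1/chat/completions",
    "MODEL_API_TOKEN": "sk-...",
})

try:
    ModelConfig.from_env({})
    error = ""
except RuntimeError as exc:
    error = str(exc)
```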

Development

Setup

git clone https://github.com/mrYush/omd.git
cd omd
uv sync --all-extras

Running Tests

uv run pytest

Linting

uv run ruff check src/ tests/
uv run ruff format --check src/ tests/

Roadmap

Payload normalization across providers (max_tokens vs max_completion_tokens, etc.)
Legacy tool-calling schema support (functions / function_call)
Retry logic with exponential backoff
Streaming responses
Additional client implementations (Anthropic, Google AI, etc.)
Richer skill metadata (capabilities, dependencies, examples) and a CLI to scaffold/validate skill packages

License

Release history

0.3.0

Apr 23, 2026

This version

0.2.0

Apr 18, 2026

Download files

Source Distribution

omd-0.2.0.tar.gz (48.1 kB view details)

Uploaded Apr 18, 2026 Source

Built Distribution

omd-0.2.0-py3-none-any.whl (57.9 kB view details)

Uploaded Apr 18, 2026 Python 3

File details

Details for the file omd-0.2.0.tar.gz.

File metadata

Download URL: omd-0.2.0.tar.gz
Upload date: Apr 18, 2026
Size: 48.1 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.1

File hashes

Hashes for omd-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`564cbccf6c4b50c5aa8e53fa38889121c8b387bea53644e40bd54b2f69370167`
MD5	`27781ced7419de4fa80f07edc84d3600`
BLAKE2b-256	`98da55dc7615f38be35058eb916705abfed1bf5d201cacd3772ee4cdf28dfee9`

File details

Details for the file omd-0.2.0-py3-none-any.whl.

File metadata

Download URL: omd-0.2.0-py3-none-any.whl
Upload date: Apr 18, 2026
Size: 57.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.6.1

File hashes

Hashes for omd-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`f82798e851dee0fdea054086bdf97848d22cd1d5622fa2a8c596ec47e0e5bc02`
MD5	`7ba1346087b302052c6e6abe65c13cc4`
BLAKE2b-256	`77f1ce42af049159e72a496176e0c24e9059b0101dd14010c429fc4d6212c9c3`
