Skip to main content

LLM Patch Driver is a framework for patching data objects using LLMs. It can generate and apply a single patch, or start the patching loop to fix complex validation issues.

Project description

LLM Patch Driver

TLDR

This library lets LLMs modify existing data objects (both structured and unstructured). It can generate and apply a single patch, or start the patching loop to fix complex validation issues.

This library is framework/client agnostic and allows you to add your own patch types and validation logic.

Useful when:

  1. You want to modify existing big document and you want to control what exactly should be changed.
  2. You want LLM to create an object from a huge Pydantic schema with recursions and custom validation (i.e., generate a valid DAG with 100+ nodes).
  3. Your prompt has too many instructions and conditions that must be met.
  4. Your desired response from LLM is too big for the output window.

Applications:

  1. Creating and updating graphs.
  2. Generating and updating long documents.
  3. Modifying codebases.
  4. Data extraction with complex schemas.

Components

Overview

The library includes the following components:

  • PatchTarget (wrapper for data you want to modify)
  • PatchDriver (handles LLM calls and generates patches)
  • BaseApiAdapter (formats core LLM objects for your LLM client)
  • BasePatch (stores both patch data and application methods)

PatchTarget

Description: PatchTarget is a wrapper around data objects that need to be modified. It acts as a smart container that holds your target content, validation logic, and patch application behavior. The component automatically creates backup copies, generates annotations for LLM understanding, and maintains lookup maps for efficient patching.

Key responsibilities:

  • Validates content against Pydantic schemas or custom validation functions
  • Annotates content with metadata to help LLMs understand structure
  • Stores backup copies for reset functionality
  • Applies patches to the wrapped object in-place
  • Tracks validation errors and debugging states

Examples:

from llm_patch_driver import PatchTarget, StrPatch, JsonPatch
from pydantic import BaseModel

# String patching with custom validation
def word_count_validator(text: str) -> str | None:
    if len(text.split()) != 500:
        return f"Text must be exactly 500 words, got {len(text.split())}"
    return None

string_target = PatchTarget(
    object="Your document content here...",
    validation_condition=word_count_validator,
    patch_type=StrPatch
)

# JSON patching with Pydantic schema validation
class Company(BaseModel):
    name: str
    employees: list[dict]

json_target = PatchTarget(
    object={"name": "Acme Corp", "employees": []},
    validation_schema=Company,
    patch_type=JsonPatch
)

# Nested object patching (content lives in an attribute)
@dataclass
class Document:
    content: str
    metadata: dict

doc_target = PatchTarget(
    object=Document(content="text to patch", metadata={}),
    content_attribute="content",  # Patch the 'content' attribute
    patch_type=StrPatch
)

PatchDriver

Description: PatchDriver is the orchestration engine that coordinates the entire patching process. It handles LLM communication, executes tool calls, and runs the iterative patching loop until validation succeeds or maximum cycles are reached. The driver is LLM-provider agnostic through the adapter pattern.

Key responsibilities:

  • Executes LLM calls using provided create/parse methods
  • Manages the autonomous patching loop with error handling
  • Supports both single patch generation and iterative fixing
  • Integrates with tool systems for LLM function calling
  • Provides debugging states and error tracking

Examples:

from llm_patch_driver import PatchDriver
from llm_patch_driver.llm import OpenAIChatCompletions
import openai

# OpenAI setup
client = openai.OpenAI()
driver = PatchDriver(
    target_object=your_patch_target,
    create_method=client.chat.completions.create,
    parse_method=client.beta.chat.completions.parse,
    model_args={'model': 'gpt-4o-mini'},
    api_adapter=OpenAIChatCompletions(),
    max_cycles=25
)

# Generate a single patch from a query
patch_bundle = await driver.request_patch_bundle(
    query="Fix the spelling errors in the document",
    context="This is a legal contract draft"
)

# Run autonomous patching loop until valid
messages = [{"role": "user", "content": "Generate a valid company structure..."}]
await driver.run_patching_loop(messages)

# Google Gemini setup
from google import genai
from llm_patch_driver.llm import GoogleGenAiAdapter

client = genai.Client(vertexai=True, project="your-project")
driver = PatchDriver(
    target_object=your_patch_target,
    create_method=client.models.generate_content,
    parse_method=client.models.generate_content,
    model_args={'model': 'gemini-2.5-pro'},
    api_adapter=GoogleGenAiAdapter()
)

BaseApiAdapter

Description: BaseApiAdapter provides a standardized interface for integrating different LLM providers. It abstracts away provider-specific API formats, message structures, and tool schemas, allowing PatchDriver to work seamlessly with OpenAI, Google, or any other LLM provider.

Prebuilt adapters: OpenAIChatCompletions, OpenAIResponses, GoogleGenAiAdapter

Key responsibilities:

  • Converts standardized inputs into provider-specific API call parameters
  • Parses provider responses back into standardized internal Message objects
  • Formats tool schemas according to each provider's function calling conventions
  • Handles provider-specific message formatting (roles, content structure, etc.)

Examples:

from llm_patch_driver.llm import OpenAIChatCompletions, GoogleGenAiAdapter
from llm_patch_driver.llm.schemas import Message

# OpenAI adapter usage
openai_adapter = OpenAIChatCompletions()

# Format messages for OpenAI API
api_params = openai_adapter.format_llm_call_input(
    messages=[Message(role="user", content="Fix this JSON")],
    tools=[tool_schema_dict],
    system_prompt="You are a helpful assistant"
)
# Returns: {"messages": [...], "tools": [...], "model": "gpt-4"}

# Parse OpenAI response back to standard format
response = openai_adapter.parse_llm_output(openai_raw_response)
# Returns: Message(role="assistant", content="...", tool_calls=[...])

# Google Gemini adapter usage  
google_adapter = GoogleGenAiAdapter()

# Same interface, different provider
api_params = google_adapter.format_llm_call_input(
    messages=[Message(role="user", content="Fix this JSON")],
    schema=YourPydanticModel
)
# Returns: {"contents": [...], "generation_config": {...}}

# Custom adapter implementation
class CustomAdapter(BaseApiAdapter):
    def format_llm_call_input(self, messages, tools=None, schema=None, system_prompt=None):
        # Convert to your provider's format
        return {"your_provider_format": "..."}
    
    def parse_llm_output(self, raw_response):
        # Convert back to Message object
        return Message(role="assistant", content="...")

BasePatch

Description: BasePatch is the abstract base class that defines how modifications are represented and applied to content. Concrete implementations handle different data types (strings, JSON) with their own patch formats and application logic. Each patch type includes specialized prompts and annotation systems optimized for LLM understanding.

Prebuilt patch types: StrPatch, JsonPatch

Key responsibilities:

  • Defines patch data structure and validation rules
  • Implements content-specific patch application logic
  • Builds lookup maps for efficient content navigation

Examples:

# String patches with sentence-level targeting
from llm_patch_driver.patch.string import StrPatch, ReplaceOp, DeleteOp, InsertAfterOp

# Replace operation targeting specific sentences
string_patch = StrPatch(
    tids=["1_2", "1_3"],  # Line 1, sentences 2 and 3
    operation=ReplaceOp(
        pattern="old text",
        replacement="new text"
    )
)

# Delete specific sentences
delete_patch = StrPatch(
    tids=["2_1"],  # Line 2, sentence 1
    operation=DeleteOp()
)

# Insert new content after a sentence
insert_patch = StrPatch(
    tids=["3_4"],  # After line 3, sentence 4
    operation=InsertAfterOp(text="This is new content to insert.")
)

# JSON patches using JSONPath-style operations
from llm_patch_driver.patch.json import JsonPatch

# Replace a JSON value
json_patch = JsonPatch(
    op="replace",
    a_id=5,  # Attribute ID from annotation map
    value="new value"
)

# Add to JSON array
array_patch = JsonPatch(
    op="add", 
    a_id=3,
    i_id=2,  # Index within array
    value={"new": "object"}
)

# Remove JSON property
remove_patch = JsonPatch(
    op="remove",
    a_id=7
)

# Custom patch implementation
class CustomPatch(BasePatch):
    @classmethod
    def get_bundle_schema(cls):
        # Define how patches are bundled together
        pass
        
    def apply_patch(self, patch_target):
        # Implement your patch application logic
        pass
        
    @classmethod  
    def build_map(cls, data):
        # Create lookup map for content navigation
        pass
        
    @classmethod
    def build_annotation(cls, data, map):
        # Generate annotated version for LLM prompts
        pass

Roadmap

  1. Replace spaCy sentencizer with a lightweight alternative
  2. Optimize content-map-annotation loop performance
  3. Add robust fallback handling for corrupted LLM patch responses
  4. Implement improved validation interface and error reporting

Contributing

Contributions are welcome! We're particularly interested in:

  • New LLM provider adapters (ApiAdapters)
  • Additional patch types for different data formats
  • Performance optimizations and bug fixes
  • Documentation improvements

Please feel free to open issues for feature requests or bug reports, and submit pull requests for any enhancements.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

llm_patch_driver-0.1.0.tar.gz (5.5 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

llm_patch_driver-0.1.0-py3-none-any.whl (6.3 kB view details)

Uploaded Python 3

File details

Details for the file llm_patch_driver-0.1.0.tar.gz.

File metadata

  • Download URL: llm_patch_driver-0.1.0.tar.gz
  • Upload date:
  • Size: 5.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.12.8

File hashes

Hashes for llm_patch_driver-0.1.0.tar.gz
Algorithm Hash digest
SHA256 7a875f794b1529491ff560885b826d4dce714a8549d42dbb4c1d5edc3dfb8cca
MD5 1a3d37dd65d6c1f0cb4b09172487ecd2
BLAKE2b-256 3ec602bf3b248381eebc531b1eb7a6421e3c3aa1dfaa753beebe7018a3887ea7

See more details on using hashes here.

File details

Details for the file llm_patch_driver-0.1.0-py3-none-any.whl.

File metadata

File hashes

Hashes for llm_patch_driver-0.1.0-py3-none-any.whl
Algorithm Hash digest
SHA256 60d5579c4f29c3133a2f23f3d0e5ab7fecaa1fe01613b13b09550195567ff09a
MD5 3bdfb7d77e0863d7ee4296b475a6a1d3
BLAKE2b-256 59e1a98a2d9ab931a5e10f485b9cb8b649288f573f7cc80e29bbac930813e0e6

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page