ldp

Agent framework for constructing language model agents and training on constructive tasks.

This repo models agent-environment interactions as a Partially Observable Markov Decision Process (POMDP). The name ldp, inspired by POMDP, stands for Language Decision Processes.
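
For readers unfamiliar with the term, the standard textbook definition of a POMDP is sketched below. This is the general formalism, not a description of ldp's internals, and the symbol names are the conventional ones rather than anything defined in this repo.

```latex
\[
  \mathcal{M} = (\mathcal{S}, \mathcal{A}, T, R, \Omega, O, \gamma),
  \qquad
  T(s' \mid s, a), \quad
  R(s, a), \quad
  O(o \mid s', a), \quad
  \gamma \in [0, 1]
\]
```

Here S is the set of hidden states, A the set of actions, T the transition probability, R the reward, Omega the set of observations, O the observation probability, and gamma the discount factor. The agent never sees the state s directly; it only receives observations o and must act on those.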

Installation

To install ldp from a local clone of this repo, in editable mode:

pip install -e .

If you plan to export Graphviz visualizations, also install the Graphviz system package with your OS package manager:

  • Linux: apt install graphviz
  • macOS: brew install graphviz
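
To confirm the system package is visible from Python, one option is to look for the Graphviz `dot` executable on PATH. This is a small sketch using only the standard library; the helper name `graphviz_available` is hypothetical and is not part of ldp.

```python
import shutil


def graphviz_available() -> bool:
    """Return True if the Graphviz `dot` executable is found on PATH."""
    return shutil.which("dot") is not None


if __name__ == "__main__":
    print("Graphviz found" if graphviz_available() else "Graphviz not found")
```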

Agent/Policy

An agent should implement two async methods:

agent_state = await agent.init_state(tools=tools)
new_action, new_agent_state, value = await agent.get_asv(
    agent_state, obs
)

The method get_asv(agent_state, obs) chooses an action (a) from the observation messages and returns it together with the next agent state (s) and a value estimate (v). The first argument, agent_state, is state specific to the agent; it can hold things like agent memory and can be used for training from episodes. You can make it None if you aren't using it.

obs is not the complete history of observations, only the most recent list returned by env.step. If the agent wants to keep earlier observations, it should track them in its state.

The value is the agent's estimate of future rewards given its state and observations, and it is used for training. It can be 0 if you have no estimate.
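
To make this contract concrete, here is a minimal sketch of a class exposing the same two coroutines. It is a stand-alone illustration, not ldp's implementation: `EchoAgent`, `EchoState`, and the use of plain strings for tools, observations, and actions are all assumptions made so the snippet runs without ldp installed.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class EchoState:
    """Hypothetical agent state: the tools plus every observation seen so far."""

    tools: list[str]
    history: list[str] = field(default_factory=list)


class EchoAgent:
    """Hypothetical stand-in that mirrors the init_state/get_asv contract."""

    async def init_state(self, tools: list[str]) -> EchoState:
        return EchoState(tools=list(tools))

    async def get_asv(
        self, agent_state: EchoState, obs: list[str]
    ) -> tuple[str, EchoState, float]:
        # obs holds only the latest observations, so the agent copies them
        # into its own state if it wants to remember them.
        new_state = EchoState(
            tools=agent_state.tools, history=[*agent_state.history, *obs]
        )
        action = f"echo: {obs[-1]}" if obs else "noop"
        value = 0.0  # this sketch makes no value estimate
        return action, new_state, value


async def demo() -> tuple[str, EchoState, float]:
    agent = EchoAgent()
    state = await agent.init_state(tools=["calculator"])
    _, state, _ = await agent.get_asv(state, ["first"])
    return await agent.get_asv(state, ["second"])


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Returning a new state object instead of mutating the old one keeps each step's state intact, which may be convenient if earlier states are inspected later.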

Generic Support

Agent (as well as the classes in agent.ops) is generic, which means:

  • Agent is designed to support arbitrary types
  • Subclasses can exactly specify state types, making the code more readable

If you are new to Python generics (typing.Generic), please read about them in the Python typing documentation.

Below is how to specify an agent with a custom state type.

from dataclasses import dataclass, field
from datetime import datetime

from ldp.agent import Agent


@dataclass
class MyComplexState:
    vector: list[float]
    timestamp: datetime = field(default_factory=datetime.now)


class MyAgent(Agent[MyComplexState]):
    """Some agent who is now type checked to match the custom state."""

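For readers who want to see the mechanism in isolation, the sketch below shows how subclassing a parameterized typing.Generic base pins down the state type. `BaseAgent`, `CounterState`, and `CounterAgent` are hypothetical stand-ins, and `init_state` is synchronous here for brevity, unlike the awaited `init_state` shown earlier.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

TState = TypeVar("TState")


class BaseAgent(Generic[TState]):
    """Hypothetical generic base class, parameterized by its state type."""

    def init_state(self) -> TState:
        raise NotImplementedError


@dataclass
class CounterState:
    steps: int = 0


class CounterAgent(BaseAgent[CounterState]):
    """Binding TState to CounterState lets a type checker verify the return type."""

    def init_state(self) -> CounterState:
        return CounterState()


state = CounterAgent().init_state()
```

A static type checker can now flag a `CounterAgent.init_state` that returns anything other than `CounterState`; nothing is enforced at runtime.
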
Complete Example

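import asyncio

from aviary.env import DummyEnv
from ldp.agent import SimpleAgent


async def main() -> None:
    env = DummyEnv()
    agent = SimpleAgent()

    # reset returns the first observations and the tools passed to the agent
    obs, tools = await env.reset()
    agent_state = await agent.init_state(tools=tools)

    done = False
    while not done:
        # get_asv returns (action, next agent state, value estimate)
        action, agent_state, _ = await agent.get_asv(agent_state, obs)
        obs, reward, done, truncated = await env.step(action.value)


asyncio.run(main())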
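
The same loop can be tried without ldp or aviary installed by swapping in stand-ins. In the sketch below, `CountdownEnv`, `FixedAgent`, and `rollout` are hypothetical names, and actions are plain strings instead of the action objects used above. It also shows one way to accumulate reward, stop on either `done` or `truncated`, and cap the number of steps.

```python
import asyncio


class CountdownEnv:
    """Hypothetical environment that finishes after a fixed number of steps."""

    def __init__(self, n_steps: int = 3) -> None:
        self.remaining = n_steps

    async def reset(self) -> tuple[list[str], list[str]]:
        # Returns (obs, tools); this toy environment offers no tools.
        return [f"{self.remaining} steps left"], []

    async def step(self, action: str) -> tuple[list[str], float, bool, bool]:
        self.remaining -= 1
        done = self.remaining <= 0
        # Returns (obs, reward, done, truncated).
        return [f"{self.remaining} steps left"], 1.0, done, False


class FixedAgent:
    """Hypothetical agent that always takes the same action."""

    async def init_state(self, tools: list[str]) -> None:
        return None

    async def get_asv(self, agent_state: None, obs: list[str]) -> tuple[str, None, float]:
        return "wait", agent_state, 0.0


async def rollout(env: CountdownEnv, agent: FixedAgent, max_steps: int = 10) -> float:
    """Run one episode and return the summed reward."""
    obs, tools = await env.reset()
    agent_state = await agent.init_state(tools=tools)
    total_reward = 0.0
    for _ in range(max_steps):
        action, agent_state, _ = await agent.get_asv(agent_state, obs)
        obs, reward, done, truncated = await env.step(action)
        total_reward += reward
        if done or truncated:
            break
    return total_reward


if __name__ == "__main__":
    print(asyncio.run(rollout(CountdownEnv(), FixedAgent())))
```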
