ldp
Agent framework for constructing language model agents and training on constructive tasks.
This repo models agent-environment interactions using a Partially Observable Markov Decision Process (POMDP).
Inspired by POMDP, this repo's name `ldp` stands for Language Decision Processes.
Installation
To install `ldp`:
pip install -e .
If you plan to export Graphviz visualizations, make sure you also install the `graphviz` library into your OS via:
- Linux: `apt install graphviz`
- macOS: `brew install graphviz`
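A quick way to confirm the installation is to check that the Graphviz `dot` executable is on your `PATH`. The check below is a minimal sketch using only the standard library; the helper name `graphviz_available` is hypothetical and not part of ldp.

```python
import shutil


def graphviz_available(executable: str = "dot") -> bool:
    """Return True if the given Graphviz executable is found on PATH."""
    return shutil.which(executable) is not None


if __name__ == "__main__":
    if graphviz_available():
        print("Graphviz found at", shutil.which("dot"))
    else:
        print("Graphviz not found; install it with apt or brew as above.")
```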
Agent/Policy
An agent should have two functions:
agent_state = await agent.init_state(tools=tools)
new_action, new_agent_state, value = await agent.get_asv(
    agent_state, obs
)
An agent should have a function `get_asv(agent_state, obs)` that chooses an action (`a`) from the observation messages, and returns the next agent state (`s`) and a value estimate (`v`).
The first argument, `agent_state`, is a state specific to the agent that can be used for training from episodes.
You can make it `None` if you aren't using it.
It could contain things like agent memory.
The `obs` argument is not the complete list of observations, but only the most recent list returned by `env.step`.
The agent should keep track of observations via its state if it would like to keep them.
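The following is a minimal sketch of an agent that keeps its own observation history in its state. It does not import ldp: `MemoryState` and `EchoAgent` are hypothetical stand-ins, and plain strings are assumed in place of real observation messages, tools, and actions.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class MemoryState:
    """Hypothetical agent state that remembers every observation seen."""

    tools: list[str]
    history: list[str] = field(default_factory=list)


class EchoAgent:
    """Stand-in agent (not an ldp class) with the two functions described above."""

    async def init_state(self, tools: list[str]) -> MemoryState:
        return MemoryState(tools=tools)

    async def get_asv(
        self, agent_state: MemoryState, obs: list[str]
    ) -> tuple[str, MemoryState, float]:
        # obs holds only the latest observations, so append them to a new
        # state object rather than mutating the old one.
        next_state = MemoryState(
            tools=agent_state.tools, history=[*agent_state.history, *obs]
        )
        action = f"echo: {obs[-1]}" if obs else "noop"
        return action, next_state, 0.0  # 0.0 is a placeholder value estimate


async def demo() -> MemoryState:
    agent = EchoAgent()
    state = await agent.init_state(tools=["search"])
    for obs in (["hello"], ["world"]):
        _, state, _ = await agent.get_asv(state, obs)
    return state


final_state = asyncio.run(demo())
print(final_state.history)  # ['hello', 'world']
```

Returning a fresh state on every call, instead of mutating the previous one, keeps each step's state intact if episodes are later replayed for training.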
The value is the agent's estimate of the future rewards given its state and observations, and it is used for training.
It can be `0` if you aren't using it.
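One common training target for such a value estimate is the discounted return of an episode. The sketch below computes it from a list of rewards. The function name and the discount factor `gamma` are assumptions for illustration; nothing here states how ldp itself consumes value estimates.

```python
def discounted_returns(rewards: list[float], gamma: float = 0.9) -> list[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```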
Generic Support
`Agent` (as well as the classes in `agent.ops`) is generic, which means:
- `Agent` is designed to support arbitrary types
- Subclasses can exactly specify state types, making the code more readable
If you are new to Python generics (`typing.Generic`), please read about them in the Python `typing` documentation.
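As a quick refresher, here is a minimal, self-contained sketch of how a `typing.Generic` base class lets a subclass pin down its state type. `BaseAgent`, `CounterState`, and `CountingAgent` are hypothetical names for illustration and are not part of ldp.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

TState = TypeVar("TState")


class BaseAgent(Generic[TState]):
    """Hypothetical stand-in for a generic agent base class (not ldp's Agent)."""

    def initial_state(self) -> TState:
        raise NotImplementedError


@dataclass
class CounterState:
    steps: int = 0


class CountingAgent(BaseAgent[CounterState]):
    """The state type is fixed to CounterState, so type checkers can verify it."""

    def initial_state(self) -> CounterState:
        return CounterState()

    def advance(self, state: CounterState) -> CounterState:
        return CounterState(steps=state.steps + 1)


agent = CountingAgent()
state = agent.advance(agent.initial_state())
print(state.steps)  # 1
```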
Below is how to specify an agent with a custom state type.
from dataclasses import dataclass, field
from datetime import datetime

from ldp.agent import Agent

@dataclass
class MyComplexState:
    vector: list[float]
    timestamp: datetime = field(default_factory=datetime.now)

class MyAgent(Agent[MyComplexState]):
    """Some agent who is now type checked to match the custom state."""
Complete Example
import asyncio

from aviary.env import DummyEnv
from ldp.agent import SimpleAgent

async def main() -> None:
    env = DummyEnv()
    agent = SimpleAgent()
    obs, tools = await env.reset()
    agent_state = await agent.init_state(tools=tools)
    done = False
    while not done:
        action, agent_state, _ = await agent.get_asv(agent_state, obs)
        obs, reward, done, truncated = await env.step(action.value)

# In a notebook, which already runs an event loop, use `await main()` instead.
asyncio.run(main())
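The loop above only checks `done`. The sketch below shows one way to also stop on `truncated` and to cap the number of steps, while accumulating the reward. It runs without ldp or aviary: `CountdownEnv` and `rollout` are hypothetical stand-ins, and the policy is a trivial placeholder rather than a language model agent.

```python
import asyncio


class CountdownEnv:
    """Hypothetical stand-in environment (not aviary's DummyEnv)."""

    def __init__(self, target: int = 3) -> None:
        self.target = target
        self.count = 0

    async def reset(self) -> tuple[list[str], list[str]]:
        self.count = 0
        return ["count=0"], ["increment"]

    async def step(self, action: str) -> tuple[list[str], float, bool, bool]:
        if action == "increment":
            self.count += 1
        done = self.count >= self.target
        reward = 1.0 if done else 0.0
        return [f"count={self.count}"], reward, done, False


async def rollout(env: CountdownEnv, max_steps: int = 10) -> tuple[float, int]:
    obs, tools = await env.reset()
    total_reward, steps = 0.0, 0
    done = truncated = False
    # Stop on either flag, and cap the number of steps as a safety net.
    while not (done or truncated) and steps < max_steps:
        action = tools[0]  # trivial policy: always call the only tool
        obs, reward, done, truncated = await env.step(action)
        total_reward += reward
        steps += 1
    return total_reward, steps


print(asyncio.run(rollout(CountdownEnv())))  # (1.0, 3)
```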