Skip to main content

Gymnasium framework for training language model agents on constructive tasks

Project description

aviary

PyPI Version PyPI Python Versions License Tests

Gymnasium framework for training language model agents on constructive tasks.

Installation

To install aviary (note fh stands for FutureHouse):

pip install fhaviary

To install aviary with the bundled environments, please see the Environments section below.

Google Colab

As of 10/25/2024, unfortunately Google Colab does not yet support Python 3.11 or 3.12 (issue).

Thus, as a workaround, you will need to install Python 3.11 into your notebook. Here is a sample notebook showing how to do this: https://colab.research.google.com/drive/1mejZ5cxgKZrMpYEe0iRoanaGGQ0Cr6WI?usp=sharing

Also, note that async code works in Google Colab.

Developer Installation

For local development, please see the CONTRIBUTING.md.

Messages

Communication between the agent and environment is done through messages. Messages have two attributes:

msg = Message(content="Hello, world!", role="assistant")

For the meaning of role, see the table below. You can change around roles as desired, except for tool which has a special meaning in aviary.

Role Host Example
assistant Agent Tool selector agent with an environment
system Agent system prompt "You are an agent."
user Environment system prompt HotPotQA problem to solve
tool Tool run in the environment Some custom number crunching program

The content is a string that can be anything, or a null value.

Environment

An environment should have two functions:

obs_msgs, tools = await env.reset()
new_obs_msgs, reward, done, truncated = await env.step(action_msg)

where messages are how communication is passed. The action_msg should be ToolRequestMessage which is 1 or more calls to tools provided by the reset. The obs_msgs returned from the environment are ToolResponseMessage or other general messages that are observations. The reward is a scalar value. The done is a boolean value. The truncated is a boolean value.

Let's see a complete example for building an environment.

Environment subclass and state

First we define an environment by subclassing the Environment and defining a state. The state is all variables that change per step and we want to keep together. It will be accessible in your tools, so you can use it to store information that you want to persist between steps and between tools.

from pydantic import BaseModel
from aviary.core import Environment


class ExampleState(BaseModel):
    reward: float = 0
    done: bool = False


class ExampleEnv(Environment[ExampleState]):
    state: ExampleState

We do not have other variables aside from state for this environment. We could have things like configuration, a name, tasks, etc. attached to it.

Common environments

We expose a simple interface to some commonly-used environments that are included in the aviary codebase. You can instantiate one by referring to its name and passing keyword arguments:

from aviary.core import Environment

env = Environment.from_name(
    "calculator",
    problem_id="example-problem",
    problem="What is 2+3?",
    answer=5,
)

Included with some environments are collections of problems that define training or evaluation datasets. We refer to these as TaskDatasets, and expose them with a similar interface:

from aviary.core import TaskDataset

dataset = TaskDataset.from_name("hotpotqa", split="dev")

Tool

Now let's define our functions that will make up our tools. We'll just have one tool. Tools can optionally have their last argument be state which is the environment state. This is how you can access the state. This argument will not be exposed to the agent as a possible parameter and will be injected by the environment (if part of the function signature).

def print_story(story: str, state: ExampleState):
    """Print a story.

    Args:
        story: Story to print.
        state: Environment state (hidden from agent - can put this string to shutup linter).
    """
    print(story)
    state.reward = 1
    state.done = True

There is special syntax we use for defining a tool. The tool is built from the following parts of the function: its name, its arguments names, the arguments types, and the docstring. The docstring is parsed to get a description of the function and its arguments, so match the syntax carefully.

Setting the state.done = True is how we indicate completion. This example terminates immediately. You can use other ways to decide to terminate.

You can make the function async - the environment will account for that when the tool is called.

Advanced tool descriptions

We support more sophisticated signatures, for those who want to use them:

  • Multiline docstrings
  • Non-primitive type hints (e.g. type unions)
  • Default values
  • Exclusion of info below \f (see below)

If you have summary-level information that belongs in the docstring, but you don't want it part of the Tool.info.description, add a r prefix to the docstring and inject \f before the summary information to exclude. This convention was created by FastAPI (docs).

def print_story(story: str | bytes, state: ExampleState):
    r"""Print a story.

    Extra information that is part of the tool description.

    \f

    This sentence is excluded because it's an implementation detail.

    Args:
        story: Story to print, either as a string or bytes.
        state: Environment state.
    """
    print(story)
    state.reward = 1
    state.done = True

Environment reset method

Now we'll define the reset function which should set-up the tools, and return one or more initial observations and the tools. The reset function is async to allow for database interactions or HTTP requests.

from aviary.core import Message, Tool


async def reset(self):
    self.tools = [Tool.from_function(ExampleEnv.print_story)]
    start = Message(content="Write a 5 word story and call print")
    return [start], self.tools

Environment step method

Now we can define the step function which should take an action and return the next observation, reward, done, and if the episode was truncated.

from aviary.core import Message


async def step(self, action: Message):
    msgs = await self.exec_tool_calls(action, state=self.state)
    return msgs, self.state.reward, self.state.done, False

You will probably often use this specific syntax for calling the tools - calling exec_tool_calls with the action.

Environment export_frame method

Optionally, we can define a function to export a snapshot of the environment and its state for visualization or debugging purposes.

from aviary.core import Frame


def export_frame(self):
    return Frame(
        state={"done": self.state.done, "reward": self.state.reward},
        info={"tool_names": [t.info.name for t in self.tools]},
    )

View Environment Tools

If an environment can be instantiated without anything other than a task (i.e., it implements from_task), you can start a server to view its tools:

pip install fhaviary[server]
aviary tools [env name]

This will start a server that allows you to view the tools and call them, viewing the descriptions/types and output that an agent would see when using the tools.

Environments

Here are a few environments implemented with aviary:

Environment PyPI Extra README
GSM8k aviary.gsm8k fhaviary[gsm8k] README.md
HotPotQA aviary.hotpotqa fhaviary[hotpotqa] README.md
PaperQA paper-qa fhaviary[paperqa] README.md

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

fhaviary-0.8.3.tar.gz (190.9 kB view details)

Uploaded Source

Built Distribution

fhaviary-0.8.3-py3-none-any.whl (42.0 kB view details)

Uploaded Python 3

File details

Details for the file fhaviary-0.8.3.tar.gz.

File metadata

  • Download URL: fhaviary-0.8.3.tar.gz
  • Upload date:
  • Size: 190.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for fhaviary-0.8.3.tar.gz
Algorithm Hash digest
SHA256 9a7cd514bb534511dfc59d87c96078c77ce9638a3a2d0b0cf18807e10a136a1b
MD5 f6cbf7c0863977472fa72d73426fcc14
BLAKE2b-256 a40eb6d89a45039b747f1748625369beffdf41d2bacd9a60236b9d8cb5d4432d

See more details on using hashes here.

File details

Details for the file fhaviary-0.8.3-py3-none-any.whl.

File metadata

  • Download URL: fhaviary-0.8.3-py3-none-any.whl
  • Upload date:
  • Size: 42.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.12.7

File hashes

Hashes for fhaviary-0.8.3-py3-none-any.whl
Algorithm Hash digest
SHA256 6f7ef352f805b89e9a950434d5df836af5c890b62129dcec05199e4ce8a6a955
MD5 d1cf8c45aa6bc5e4e3642b25c5c2782d
BLAKE2b-256 ba854750c29bd18035d31c6f2b74ba9306b4ca05da3c9fbc593e0be2521cce41

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page