Agent Gym - PyTorch

Agent Gym

Convert any model into an R1-like reasoning agent. Agent Gym builds on TRL, Hugging Face, and various other libraries. This is a work in progress; the goal is to make it easy to train any model into a reasoning agent.

Installation

pip3 install -U agentgym

Usage

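from agentgym.r1_pipeline import R1Pipeline, SFTConfig

# Build the pipeline: base model, SFT dataset, and SFT training arguments.
r1_pipeline = R1Pipeline(sft_model="gpt2", sft_dataset="stanfordnlp/imdb", sft_args=SFTConfig(output_dir="/tmp"))

# Run the training pipeline.
r1_pipeline.run()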

Architecture

The pipeline has two training stages, applied in order:

  • SFT: Supervised Fine-Tuning
  • GRPO: Group Relative Policy Optimization

model -> sft -> grpo -> reasoning model

graph TD;
    A[model] --> B[sft]
    B --> C[grpo]
    C --> D[reasoning model]
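
To make the GRPO stage more concrete, here is a minimal sketch of the group-relative advantage computation that gives GRPO its name: several completions are sampled for the same prompt, each is scored, and every reward is normalized against the mean and standard deviation of its own group. This is an illustration only, not code from agentgym or TRL. The function name `group_relative_advantages`, the `eps` parameter, and the use of the population standard deviation are assumptions made for this example; real implementations may differ in those details.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for a single prompt.

    Hypothetical helper for illustration; not part of agentgym or TRL.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation (an assumption)
    # eps avoids division by zero when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt, scored by some reward function (made-up values).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that score above their group's mean get a positive advantage and are reinforced; those below the mean get a negative one. Because the baseline comes from the group itself, no separate value model is needed.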

License

MIT
