Craftax LM
A wrapper around the Craftax agent benchmark, for evaluating digital agents.
Usage
First, install the package with pip install craftaxlm. Next, import the agent-computer interface of your choice via
from craftaxlm import CraftaxACI, CraftaxClassicACI
This package is early in development, so for implementation examples, please refer to the baseline ReAct implementation.
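Because the package is at an early stage, it can help to guard the import shown above so a script fails gracefully when craftaxlm is missing or a class has been renamed. The following is a minimal sketch, not part of the package: the helper name load_aci is hypothetical, and only the module name craftaxlm and the class names CraftaxACI and CraftaxClassicACI come from this README.

```python
import importlib


def load_aci(class_name: str = "CraftaxClassicACI"):
    """Return the named class from craftaxlm, or None if it is unavailable.

    load_aci is a hypothetical helper for illustration; it only looks up
    the class and does not construct or call it.
    """
    try:
        module = importlib.import_module("craftaxlm")
    except ImportError:
        # Package not installed (or one of its dependencies is missing).
        return None
    # getattr with a default avoids an AttributeError if the name changes.
    return getattr(module, class_name, None)


aci_cls = load_aci()
if aci_cls is None:
    print("craftaxlm is not available; try: pip install craftaxlm")
else:
    print(f"Loaded {aci_cls.__name__}")
```

How the returned class is constructed and stepped is not covered here; the baseline ReAct implementation remains the reference for that.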
Leaderboard
Craftax-Classic
LM | Algorithm | Reward (% max) | Code |
---|---|---|---|
gpt-4o-mini | ReAct | 18.4 | CraftaxLM_Baselines |
Craftax-Full
LM | Algorithm | Reward (% max) | Code |
---|---|---|---|
gpt-4o-mini | ReAct | 2.9 | CraftaxLM_Baselines |
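The "Reward (% max)" column in the tables above is not defined in this README. One plausible reading, sketched below as an assumption, is the mean episode reward divided by the maximum achievable reward, expressed as a percentage. The function name percent_of_max and the sample numbers are hypothetical and are not taken from the benchmark.

```python
from statistics import mean
from typing import Sequence


def percent_of_max(episode_rewards: Sequence[float], max_reward: float) -> float:
    """Mean episode reward as a percentage of max_reward.

    Assumption: this mirrors how a "% max" score might be computed;
    the leaderboard's exact definition is not stated in the README.
    """
    if max_reward <= 0:
        raise ValueError("max_reward must be positive")
    if not episode_rewards:
        raise ValueError("episode_rewards must not be empty")
    return 100.0 * mean(episode_rewards) / max_reward


# Hypothetical numbers for illustration only.
score = percent_of_max([2.0, 4.0], max_reward=20.0)
print(f"{score:.1f}")  # prints 15.0
```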
Dev Instructions
pyenv virtualenv craftax_env
poetry install
When in doubt
from jax import debug
...
debug.breakpoint()
📚 Citation
To learn more about Craftax, see the paper website. To cite the underlying Craftax environment, use:
@inproceedings{matthews2024craftax,
author={Michael Matthews and Michael Beukman and Benjamin Ellis and Mikayel Samvelyan and Matthew Jackson and Samuel Coward and Jakob Foerster},
title = {Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning},
booktitle = {International Conference on Machine Learning ({ICML})},
year = {2024}
}
To cite the Crafter benchmark, see:
@article{hafner2021crafter,
title={Benchmarking the Spectrum of Agent Capabilities},
author={Danijar Hafner},
year={2021},
journal={arXiv preprint arXiv:2109.06780},
}
Contributing
uv venv craftaxlm-dev
source craftaxlm-dev/bin/activate
uv run ruff format .