The official InterCode benchmark package - a framework for interactive code tasks
🤖💻 Intercode
Build interactive code environments for training, testing, and augmenting code and decision making agents
👋 Overview
InterCode is a lightweight, flexible, and easy-to-use framework for designing interactive code environments. Following the popular gym interface definition, InterCode makes it easy to quickly define a code environment and deploy an agent to operate in code within the context of the environment.
For an overview of InterCode, building interactive code tasks with InterCode, and evaluating agents on InterCode environments, please check out our wiki, website and the original paper:
InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback
🛠️ Installation
Note: InterCode requires python >= 3.8 and a local docker installation to run. Learn more here to install docker.
pip install intercode-bench
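Before building any environment, it can help to confirm that both prerequisites are visible from Python. The following is a minimal sketch using only the standard library; `check_prerequisites` is a hypothetical helper, not part of InterCode, and it only checks that a `docker` executable is on the PATH, not that the Docker daemon is running.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8)):
    """Return a dict reporting whether each prerequisite appears to be met.

    Hypothetical helper for illustration; it is not shipped with InterCode.
    """
    return {
        # Compare the running interpreter's (major, minor) with the minimum.
        "python": sys.version_info[:2] >= min_python,
        # shutil.which returns None when no `docker` executable is on PATH.
        # This does not confirm that the Docker daemon is actually running.
        "docker": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If either entry reports `missing`, fix that first; the quick start scripts build and run Docker images, so they are unlikely to work without it.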
🚀 Quick Start
Bash
Create a python file and copy + paste the following code to interact with the InterCode Bash environment.
from intercode.assets import bash_build_docker, bash_image_name, bash_test_data
from intercode.envs import BashEnv

if __name__ == '__main__':
    # Build the Docker image used by the Bash environment
    bash_build_docker()
    env = BashEnv(bash_image_name, data_path=bash_test_data, traj_dir="logs/", verbose=True)

    try:
        # Play through three task episodes, reading each action from the terminal
        for idx in range(3):
            env.reset()
            obs, done = env.observation, False
            while not done:
                action = input('> ')
                obs, reward, done, info = env.step(action)
    except KeyboardInterrupt:
        print("Keyboard interrupt detected")
    finally:
        # Always release the environment, even after an interrupt
        env.close()
If InterCode was installed successfully, the InterCode Bash environment should start and a CLI prompt should appear, allowing you to enter bash commands to interact with the task setting's file system.
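The loop above reads each action from the keyboard, but the same `reset` / `step` pattern can be driven by a program. The sketch below swaps `input('> ')` for a policy function. To keep it runnable without Docker, it uses `EchoEnv`, a hypothetical stand-in that only mimics the interface seen in the quick start; `run_episode`, `scripted_policy`, and the use of `"submit"` to end an episode are likewise assumptions made for illustration, not InterCode's documented behavior.

```python
from typing import Callable, Dict, List, Tuple


class EchoEnv:
    """Hypothetical stand-in that mimics the reset/step loop used above.

    It is NOT part of InterCode; it only lets the loop run without Docker.
    In this stub, an episode ends when the agent sends "submit" or when
    max_steps actions have been taken.
    """

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.observation = ""
        self.steps = 0

    def reset(self) -> None:
        self.steps = 0
        self.observation = "ready"

    def step(self, action: str) -> Tuple[str, float, bool, Dict]:
        self.steps += 1
        done = action == "submit" or self.steps >= self.max_steps
        reward = 1.0 if action == "submit" else 0.0
        self.observation = f"ran: {action}"
        return self.observation, reward, done, {"steps": self.steps}

    def close(self) -> None:
        pass


def run_episode(env, policy: Callable[[str, int], str]) -> List[Tuple[str, str, float]]:
    """Run one episode, asking `policy` for each action instead of input()."""
    env.reset()
    obs, done = env.observation, False
    history = []
    turn = 0
    while not done:
        action = policy(obs, turn)
        obs, reward, done, info = env.step(action)
        history.append((action, obs, reward))
        turn += 1
    return history


def scripted_policy(obs: str, turn: int) -> str:
    """Replay a fixed list of commands, then submit."""
    script = ["ls", "pwd"]
    return script[turn] if turn < len(script) else "submit"


if __name__ == "__main__":
    env = EchoEnv()
    try:
        for action, obs, reward in run_episode(env, scripted_policy):
            print(action, "->", obs, reward)
    finally:
        env.close()
```

Since `run_episode` only relies on `reset`, `observation`, and `step`, passing the `BashEnv` instance from the quick start in place of `EchoEnv` should work in principle, though that has not been verified here.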
SQL
Create a python file and copy + paste the following code to interact with the InterCode SQL environment.
from intercode.assets import sql_build_docker, sql_image_name, sql_test_data
from intercode.envs import SqlEnv
from typing import Dict

def preprocess(record: Dict) -> str:
    # Select the database named in the task record
    db = record["extra"]["db"]
    return f"use {db}"

if __name__ == '__main__':
    # Build the Docker image used by the SQL environment
    sql_build_docker()
    env = SqlEnv(sql_image_name, data_path=sql_test_data, preprocess=preprocess, traj_dir="logs/", verbose=True)

    try:
        # Play through three task episodes, reading each action from the terminal
        for idx in range(3):
            env.reset()
            obs, done = env.observation, False
            while not done:
                action = input('> ')
                obs, reward, done, info = env.step(action)
    except KeyboardInterrupt:
        print("Keyboard interrupt detected")
    finally:
        # Always release the environment, even after an interrupt
        env.close()
If InterCode was installed successfully, the InterCode SQL environment should start and a CLI prompt should appear, allowing you to enter SQL commands to interact with the task setting's MySQL database.
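To see what the `preprocess` hook in the SQL script produces, it can be called on its own. The sketch below repeats the function from the quick start and applies it to a made-up record: only the `record["extra"]["db"]` path comes from the code above, while the `query` field and the database name are hypothetical. The returned `use <db>` statement is presumably run by the environment to select the task's database, though this page does not state when that happens.

```python
from typing import Dict


def preprocess(record: Dict) -> str:
    # Same hook as in the quick start: build a statement that selects
    # the database named in the task record.
    db = record["extra"]["db"]
    return f"use {db}"


# Hypothetical task record. Only the "extra" -> "db" path is taken from the
# quick start code; the "query" field and all values are made up.
sample_record = {
    "query": "How many singers are there?",
    "extra": {"db": "concert_singer"},
}

statement = preprocess(sample_record)
print(statement)  # prints: use concert_singer
```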
🔎 Learn More
To learn more about the InterCode framework, please check out the website and the GitHub repository.
🪪 License
Check LICENSE.md
File details
Details for the file intercode-bench-0.1.21.tar.gz.
File metadata
- Download URL: intercode-bench-0.1.21.tar.gz
- Upload date:
- Size: 141.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | df02618b8dd3e92e6903d8d40162b8bfbaa31ada1278bfc61f0d88f0a95f2927 |
| MD5 | 49aa77a094f53a661ed35d90dccafbd9 |
| BLAKE2b-256 | f4c4bdf0ca1cf226766a6c4ce6654d7ee25c9474d6626fff0ff4eff5c78eb306 |
File details
Details for the file intercode_bench-0.1.21-py3-none-any.whl.
File metadata
- Download URL: intercode_bench-0.1.21-py3-none-any.whl
- Upload date:
- Size: 143.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d68210db8add216398367d95fbb36ea55671cec2481eb22be2e389639236a022 |
| MD5 | 1e87417fd6b10b274798bb67fc3d9aac |
| BLAKE2b-256 | 9fc99dc505a7400d84fe5263164ccca5de760f1d02fde5b827c761d5e059a830 |