Code RL

These details have not been verified by PyPI

Project links

Homepage

Project description

code-rl: Reinforcement Learning for Code Generation

Overview

code-rl is a Python package for training language models to generate code with reinforcement learning. It integrates with OpenAI Gym and provides a structured framework for assessing the quality of the code a model generates. As of version 0.0.3, code-rl supports only the C programming language. Support for Java and Go is planned for future releases.

Installation

To install code-rl, run the following command:

pip install code-rl

Python 3.x must be installed before you run this command.

Getting Started

Prerequisites

Supported systems: Linux and macOS.

gcc 6+ or clang 11+ must be installed.
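
Before creating the environment, it can help to confirm that a C compiler is on the PATH. The following is a minimal sketch that uses only the Python standard library. The helper name `find_c_compiler` is hypothetical and not part of code-rl, and the check does not verify the compiler version.

```python
import shutil


def find_c_compiler(candidates=("gcc", "clang")):
    """Return the path of the first compiler found on PATH, or None.

    Hypothetical helper, not part of code-rl. It only checks that the
    executable exists; it does not check the gcc 6+ / clang 11+ requirement.
    """
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    compiler = find_c_compiler()
    if compiler is None:
        print("No C compiler found: install gcc or clang first")
    else:
        print(f"Using C compiler at {compiler}")
```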

Basic Usage

The following example runs a single training episode with a dummy model:

from coderl import CodeCompilerEnv

# Dummy model: ignores the prompt and always returns the same C program
def model(prompt):
    return "int main(){return 0;}"

env = CodeCompilerEnv()

# Run one training episode
prompt = "Give me a code to print hello world"
for _ in range(1000):
    action = model(prompt)
    observation, reward, done, info = env.step(action)
    if done:
        break
    # Update the prompt here using the stderr/stdout returned in info

print(reward)
print(info)
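
The loop above leaves the prompt update to the caller. One possible way to fold the feedback from `info` into the next prompt is sketched below. The function `build_feedback_prompt`, its parameters, and the prompt layout are illustrative assumptions and not part of code-rl. The only thing taken from the package's examples is that `info` may carry a `stderr` or `stdout` key.

```python
def build_feedback_prompt(task, code, info, max_chars=500):
    """Build a follow-up prompt from the task, the last attempt, and its feedback.

    Hypothetical helper, not part of code-rl. `info` is assumed to be a dict
    that may contain a 'stderr' or 'stdout' string, as in the package examples.
    """
    parts = [task, "Previous attempt:", code]
    stderr = info.get("stderr", "")
    stdout = info.get("stdout", "")
    if stderr:
        # Truncate long compiler diagnostics so the prompt stays short
        parts += ["Compiler output:", stderr[:max_chars]]
    elif stdout:
        parts += ["Program output:", stdout[:max_chars]]
    return "\n".join(parts)


if __name__ == "__main__":
    info = {"stderr": "temp_code.c:4:9: error: unused variable"}
    print(build_feedback_prompt("Print hello world in C", "int main(){int x;}", info))
```

Inside the training loop, a call such as `prompt = build_feedback_prompt(task, action, info)` would replace the placeholder comment.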

API Reference

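CodeCompilerEnv: The Gym environment for code evaluation.
- reset(): Resets the environment to its initial state.
- step(action): Takes a string of C source code as the action, evaluates it, and returns a tuple (observation, reward, done, info). In the examples below, info holds the program's stdout on success or the compiler's stderr on failure.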
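
The four values returned by step() can be easier to work with as named fields. The sketch below wraps such a tuple in a `NamedTuple`. `StepResult` and `describe` are hypothetical names and not part of code-rl, and the field types are an assumption based on the example outputs shown on this page.

```python
from typing import Any, Dict, NamedTuple


class StepResult(NamedTuple):
    """Named view of a step() result (hypothetical, not part of code-rl)."""
    observation: int
    reward: int
    done: bool
    info: Dict[str, Any]


def describe(result):
    """Format a raw step() tuple as a one-line summary."""
    step = StepResult(*result)
    return f"reward={step.reward} done={step.done} info keys={sorted(step.info)}"


if __name__ == "__main__":
    # Tuple copied from the first example on this page
    print(describe((1, 1, True, {"stdout": "Hello World"})))
```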

Examples

from coderl import CodeCompilerEnv
env = CodeCompilerEnv()
code = """
#include<stdio.h>
int main(){
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (1, 1, True, {'stdout': 'Hello World'})

code = """
#include<stdio.h>
struct Point {
    int x, y;
};
int main(){
    if (1){
        printf("Hello World");
    }
    struct Point p = { 1 };  // y has no explicit initializer, which -Wextra reports
    printf("%d", p.x );
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -1, True, {'stderr': 'temp_code.c: ...'})

code = """
#include<stdio.h>
int main(){
    int x; // unused variable: -Wall reports it
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -2, True, {'stderr': 'temp_code.c:4:9: error: unused variable ‘x’  ...'})

code = """
#include<stdio.h>
void foo(void){
    return;
    }

int main(){
    printf("Hello World");
    foo();
    }
"""
result = env.step(code)
print(result)
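
To see the diagnostics behind these results outside the environment, the same source can be compiled by hand with the warning flags named in the comments above. The sketch below is an assumption about a reasonable local check and not code-rl's actual implementation. The helper names are hypothetical, and the exact flags the package passes to the compiler are not documented here.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def compile_command(source, output, compiler="gcc"):
    """Build a compiler command line with -Wall and -Wextra enabled."""
    return [compiler, "-Wall", "-Wextra", str(source), "-o", str(output)]


def check_c_source(code, compiler="gcc"):
    """Compile `code` in a temporary directory.

    Returns (returncode, stderr), or None if the compiler is not on PATH.
    Hypothetical helper, not part of code-rl.
    """
    if shutil.which(compiler) is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "temp_code.c"
        source.write_text(code)
        proc = subprocess.run(
            compile_command(source, Path(tmp) / "temp_code", compiler),
            capture_output=True,
            text=True,
        )
        return proc.returncode, proc.stderr


if __name__ == "__main__":
    print(check_c_source("#include<stdio.h>\nint main(){\n    int x;\n    return 0;\n}\n"))
```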

Contributing

We warmly welcome contributions to code-rl and value your efforts to improve and expand this package. If you're interested in contributing, please start by forking the repository and submitting your changes through a pull request. We encourage you to adhere to established Python coding standards (PEP 8) for consistency. When submitting a pull request, please provide a clear description of the changes and any relevant issue numbers. We also recommend adding tests for new features to ensure reliability. For substantial changes, please open an issue first to discuss what you would like to change. Your contributions play a significant role in the development of code-rl, and we look forward to collaborating with the community!

License

MIT License

Project details

Release history

- 0.0.9 (this version): Jan 25, 2024
- 0.0.8: Jan 25, 2024
- 0.0.7: Jan 25, 2024
- 0.0.6: Dec 25, 2023
- 0.0.4: Dec 23, 2023
- 0.0.3: Dec 23, 2023
- 0.0.2: Dec 22, 2023
- 0.0.1: Dec 22, 2023

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

code_rl-0.0.9-py3-none-any.whl (10.6 kB)

Uploaded Jan 25, 2024 Python 3

Hashes for code_rl-0.0.9-py3-none-any.whl

Algorithm	Hash digest
SHA256	`88cfc716dd43820e2bcf8f9cca09baedfabdf35611dd9bac7fb47d302ff6212b`
MD5	`c02feb468274c801d8d6279a93167bb4`
BLAKE2b-256	`295d307f65c3856d5ec3cef2016588efef84fb21a6940d98f64214876aaa77f9`