code-rl: Reinforcement Learning for Code Generation

Overview

code-rl is a Python package for training language models to generate code with reinforcement learning. It provides an OpenAI Gym environment that gives a structured way to assess the quality of the code a model generates. As of version 0.0.3, code-rl supports only the C programming language. Support for Java and Go is planned for future releases.

Installation

To install code-rl, run the following command:

pip install code-rl

code-rl requires Python 3.x, so make sure it is installed first.

Getting Started

Prerequisites

Supported systems: Linux and macOS.

gcc 6+ or clang 11+ must be installed.
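
Before creating the environment, it can help to confirm that a compiler is reachable. The following is a minimal sketch, not part of code-rl: the helper name find_c_compiler is hypothetical, and it only checks that gcc or clang is on PATH, not that the version meets the minimum above.

```python
import shutil


def find_c_compiler(candidates=("gcc", "clang")):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path is not None:
            return path
    return None


if __name__ == "__main__":
    compiler = find_c_compiler()
    if compiler is None:
        print("No C compiler found on PATH; install gcc or clang first.")
    else:
        print(f"Found C compiler: {compiler}")
```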

Basic Usage

The following example runs a training episode with a dummy model:

from coderl import CodeCompilerEnv
# dummy model
def model(prompt):
    return "int main(){return 0;}"
env = CodeCompilerEnv()
# Example of running a training episode
observation = "Give me a code to print hello world"
for _ in range(1000):
    action = model(observation)
    observation, reward, done, info = env.step(action)
    if done:
        break
    # Update your prompt based on the stderr/stdout in info
print(reward)
print(info)
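
The loop above leaves the prompt update to the caller. One possible way to do it is sketched below, assuming info is a dict that may contain 'stderr' or 'stdout' keys, as in the examples in this README. The function build_feedback_prompt and the wording of the feedback are hypothetical and not part of code-rl.

```python
def build_feedback_prompt(task, info):
    """Combine the original task with compiler or program output from info.

    Assumes info may carry 'stderr' and/or 'stdout' string entries.
    """
    stderr = info.get("stderr", "").strip()
    stdout = info.get("stdout", "").strip()
    if stderr:
        return (
            f"{task}\n\n"
            f"The previous attempt produced this compiler output:\n{stderr}\n\n"
            "Please fix the code."
        )
    if stdout:
        return f"{task}\n\nThe previous attempt printed:\n{stdout}"
    # Nothing to report, so keep the prompt unchanged.
    return task


if __name__ == "__main__":
    task = "Give me a code to print hello world"
    print(build_feedback_prompt(task, {"stderr": "temp_code.c: ..."}))
```

Inside the training loop, the returned string would replace the prompt passed to model on the next iteration.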

API Reference

  • CodeCompilerEnv: The Gym environment for code evaluation.
    • reset(): Resets the environment to its initial state.
    • step(action): Executes an action in the environment.
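
The examples in this README suggest that step(action) returns a 4-tuple of (observation, reward, done, info). A small sketch of giving that tuple named fields follows. StepResult and succeeded are hypothetical helpers, not part of code-rl, and treating a positive reward with no 'stderr' entry as success is an assumption inferred from the sample outputs.

```python
from typing import Any, Dict, NamedTuple


class StepResult(NamedTuple):
    """Named view of the tuple returned by env.step (layout assumed)."""
    observation: Any
    reward: float
    done: bool
    info: Dict[str, Any]


def succeeded(result):
    """Assumption: positive reward and no stderr means the code was accepted."""
    return result.reward > 0 and "stderr" not in result.info


# Sample tuples copied from the Examples section of this README.
ok = StepResult(*(1, 1, True, {"stdout": "Hello World"}))
failed = StepResult(*(0, -1, True, {"stderr": "temp_code.c: ..."}))

print(ok.reward, succeeded(ok))          # 1 True
print(failed.reward, succeeded(failed))  # -1 False
```

With a real environment, the same wrapper could be applied as StepResult(*env.step(code)).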

Examples

from coderl import CodeCompilerEnv
env = CodeCompilerEnv()
code = """
#include<stdio.h>
int main(){
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (1, 1, True, {'stdout': 'Hello World'})
code = """
#include<stdio.h>
struct Point {
    int x, y;
};
int main(){
    if (1){
        printf("Hello World");
    }
    struct Point p = { 1 };  // y has no explicit initializer, which -Wextra reports
    printf("%d", p.x );
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -1, True, {'stderr': 'temp_code.c: ...'})
code = """
#include<stdio.h>
int main(){
    int x; // unused variable; -Wall reports it
    printf("Hello World");
    return 0;
    }"""
result = env.step(code)
print(result) # (0, -2, True, {'stderr': 'temp_code.c:4:9: error: unused variable ‘x’  ...'})
code = """
#include<stdio.h>
void foo(void){
    return;
    }

int main(){
    printf("Hello World");
    foo();
    }
"""
result = env.step(code)
print(result) # (0, -2, True, {'stderr': 'temp_code.c:4:9: error: unused variable ‘x’  ...'})

Contributing

We warmly welcome contributions to code-rl and value your efforts to improve and expand this package. If you're interested in contributing, please start by forking the repository and submitting your changes through a pull request. We encourage you to adhere to established Python coding standards (PEP 8) for consistency. When submitting a pull request, please provide a clear description of the changes and any relevant issue numbers. We also recommend adding tests for new features to ensure reliability. For substantial changes, please open an issue first to discuss what you would like to change. Your contributions play a significant role in the development of code-rl, and we look forward to collaborating with the community!

License

MIT License
