Debug code generation models
CodeGaze: A library for evaluating and debugging code generation models
CodeGaze implements a set of evaluation metrics and visualization tools for debugging code generation models.
CodeGaze is built around a set of abstractions that allow for the evaluation of code generation models:
- Dataset: A code generation dataset, e.g., HumanEval.
- Experiment: A set of parameters that define the evaluation of a code generation model. Each experiment specifies things like the dataset, model properties (temperature, n_completions), and some metric properties.
- Model: A code generation model to be evaluated, either an OpenAI model or a HuggingFace model.
The basic starting point is to run an experiment on a dataset with a list of models.
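CodeGaze's own metric API is not documented here, but as background, code generation benchmarks such as HumanEval are typically scored with the unbiased pass@k estimator: given n sampled completions per problem, of which c pass the tests, it estimates the probability that at least one of k random samples is correct. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k).

    n: total completions sampled per problem
    c: completions that passed the tests
    k: samples "drawn" per problem for the metric
    """
    if n - c < k:
        # Fewer than k incorrect completions exist, so any draw
        # of k samples must include at least one correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 2 completions of which 1 passes, `pass_at_k(2, 1, 1)` is 0.5. This is the standard estimator used with HumanEval-style datasets; whether CodeGaze computes it exactly this way is an assumption.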
Installation
Install from PyPI:
pip install codegaze
Or install locally from this repo:
- Clone the repository.
- Navigate to the directory and run:
pip install -e .
- Run:
codegaze ui --port 8080
to launch the debugging UI on port 8080.
Running Experiments
The scripts folder contains Python scripts used for experiments:
run_gen.py
: given an experiment config (modify it in the file), generates completions for functions and blocks.
run_eval.py
: computes metrics (function and block) based on the generated completion data.
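The generation step is driven by an experiment config edited directly in run_gen.py. A sketch of what such a config might hold, based on the experiment parameters described above (dataset, model properties like temperature and n_completions, and metric properties); the field names are assumptions for illustration, not the actual CodeGaze schema:

```python
from dataclasses import dataclass, field

@dataclass
class ExperimentConfig:
    # Hypothetical config; the real parameters live in scripts/run_gen.py.
    dataset: str = "humaneval"          # which benchmark to evaluate on
    model: str = "gpt-3.5-turbo"        # OpenAI or HuggingFace model id (assumed)
    temperature: float = 0.8            # sampling temperature for completions
    n_completions: int = 10             # completions sampled per problem
    metric_kwargs: dict = field(default_factory=dict)  # metric properties

cfg = ExperimentConfig(n_completions=20)
```

Keeping the config in a dataclass like this makes each experiment's parameters explicit and easy to log alongside the generated completions.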