Framework for code synthesis and AI4SE research

Synthegrator

Synthegrator is a framework for code generation problems. It simplifies the process of loading common datasets and solving them with language models.

Installation

pip install synthegrator

To execute generated code, you will also need Docker installed.
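
Before a long run, it can help to confirm that Docker is actually reachable. The following is a minimal sketch using only the Python standard library; the helper name `docker_available` is hypothetical and is not part of synthegrator. It assumes the standard `docker` CLI is what you installed.

```python
import shutil
import subprocess


def docker_available(timeout_s: float = 10.0) -> bool:
    """Return True if a `docker` CLI is on PATH and the daemon responds.

    Hypothetical helper for illustration; not part of synthegrator.
    """
    # No CLI on PATH means Docker is not installed (or not visible here).
    if shutil.which("docker") is None:
        return False
    try:
        # `docker info` exits non-zero when the daemon cannot be reached.
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print("Docker available:", docker_available())
```

If this reports False, starting the Docker daemon (or fixing PATH) before running the example below is probably the first thing to try.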

Example

Let's take a look at an example of how we can run a solver over the HumanEval dataset, which collects 164 function synthesis problems.

# Imports
from lmwrapper.openai_wrapper import get_open_ai_lm, OpenAiModelNames
from synthegrator.code_solver import LmCodeSolverAutoRegressive
from synthegrator.execution_threading import solve_and_evaluate_problems
from synthegrator.synthdatasets.human_eval import yield_human_eval
from synthegrator.df_converters import solution_evals_to_df

# Load a dataset (here, the HumanEval problems)
problems = list(yield_human_eval())

# Create a solver that can solve a problem
lm = get_open_ai_lm(OpenAiModelNames.gpt_3_5_turbo_instruct)
#    ^ Make sure to add your API key to OPENAI_API_KEY or a file.
#    See https://github.com/DaiseyCode/lmwrapper for more.
solver = LmCodeSolverAutoRegressive(lm)

# Generate code and run each problem's test cases
evals = list(solve_and_evaluate_problems(
    solver=solver,
    problems=problems,
    max_threads_eval=4,
))
# Convert to a dataframe
df = solution_evals_to_df(
    evals, 
    pickle_gzip_whole_solution_eval=True
)
print("Fraction Passing", df.main_metric__is_success.mean())
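
With only 164 problems, the printed pass fraction carries some sampling noise. As a rough sketch (not part of synthegrator), a Wilson score interval gives an approximate range around it. The function name `wilson_interval` is hypothetical, and the count of 82 passing problems below is a made-up placeholder, not a measured result.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Approximate confidence interval for a pass fraction (Wilson score).

    z=1.96 corresponds to roughly 95% coverage.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must be between 0 and total")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# Placeholder numbers for illustration only: 82 of 164 problems passing.
low, high = wilson_interval(successes=82, total=164)
print(f"Approx. 95% interval: [{low:.3f}, {high:.3f}]")
```

With the dataframe from the example above, the inputs would presumably be `int(df.main_metric__is_success.sum())` and `len(df)`, assuming that column is boolean with one row per problem.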

Architecture

Guiding Design Requirements

  • DR-1 Support Diverse Datasets and Tasks. We want an architecture that can support diverse tasks (including potentially complex, repository-level tasks).
  • DR-2 Consistent & Efficient Execution. Experiments often involve running LLM-generated code. We want this to be fast, efficient, and reasonably secure.
  • DR-3 Adaptable to State-of-the-Art Models. This includes models like those from OpenAI or on HuggingFace. The architecture should also be adaptable to models that might do complex retrieval or reasoning.
  • DR-4 Maintainable. Try to follow best practices around automated testing and continuous integration.

Diagram

[Figure: synthegrator architecture diagram]
