Generate code from unit-tests

Project description

Unvibe: Generate code that passes unit-tests

Unvibe quickly generates many alternative implementations for functions and classes you annotate with @ai, and re-runs your unit-tests until it finds a correct implementation.

The algorithm tries many candidate implementations, feeds failing test output back to the LLM, and keeps exploring the candidates that pass the most tests.

This approach has been shown, in research and in practice, to produce much better results than code generation alone (see the Research section below).
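Conceptually, the search works like the sketch below. All names here are illustrative (llm_propose and run_tests are hypothetical stand-ins for unvibe's internals, which are not documented on this page); the tuning parameters mirror the [search] settings described under Setup & Configuration.

import random

def run_tests(impl, tests):
    """Count how many of the given test callables pass for a candidate."""
    return sum(1 for test in tests if test(impl))

def llm_propose(parent, failures, temperature):
    """Ask the LLM for a revised implementation (stubbed in this sketch)."""
    raise NotImplementedError("stand-in for the LLM call")

def search(initial, tests, max_depth=8, random_spread=4, max_temperature=0.3):
    frontier = [initial]
    for _ in range(max_depth):
        candidates = []
        for parent in frontier:
            # Collect the failing tests so their errors can be fed back to the LLM
            failures = [t for t in tests if not t(parent)]
            for _ in range(random_spread):
                temp = random.uniform(0.0, max_temperature)
                candidates.append(llm_propose(parent, failures, temp))
        # Keep exploring the candidates that pass the most tests
        candidates.sort(key=lambda c: run_tests(c, tests), reverse=True)
        if run_tests(candidates[0], tests) == len(tests):
            return candidates[0]  # every test passes: done
        frontier = candidates[:random_spread]
    return None  # no fully passing implementation found within max_depth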

Install

Just add unvibe as a dependency to your project:

pip install unvibe

Example

First, define a new function in your existing Python project and annotate it with @ai. Let's suppose this lives in lisp.py:

from unvibe import ai


@ai
def lisp(expr: str):
    """A lisp interpreter in plain Python; don't use external libraries."""
    pass

Now, write a few unit-tests, for example in test_lisp.py, to define what the function should do:

import unvibe
from lisp import lisp


# You can also inherit from unittest.TestCase, but unvibe.TestCase provides a better reward function for the search
class LispInterpreterTestClass(unvibe.TestCase):
    def test_calculator(self):
        self.assertEqual(lisp("(+ 1 2)"), 3)
        self.assertEqual(lisp("(* 2 3)"), 6)

    def test_nested(self):
        self.assertEqual(lisp("(* 2 (+ 1 2))"), 6)
        self.assertEqual(lisp("(* (+ 1 2) (+ 3 4))"), 21)

    def test_list(self):
        self.assertEqual(lisp("(list 1 2 3)"), [1, 2, 3])

    def test_call_python_functions(self):
        self.assertEqual(lisp("(list (range 3)"), [0, 1, 2])
        self.assertEqual(lisp("(sum (list 1 2 3)"), 6)

Now, let's use Unvibe to search for a valid implementation that passes all the tests:

$ python -m unvibe test_lisp.py

The library will generate many alternative implementations, re-run the tests on each, and keep exploring the ones that pass the most tests, feeding the test errors back to the LLM. In the end you will find a new file called unvibe_lisp.py containing a valid implementation. If multiple valid implementations are found, they are all written to the same folder.
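Once the search finishes, the generated module can be used like any other Python file. A quick check, assuming the run above produced unvibe_lisp.py next to your sources:

from unvibe_lisp import lisp

print(lisp("(+ 1 2)"))       # 3
print(lisp("(list 1 2 3)"))  # [1, 2, 3]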

Setup & Configuration

$ pip install unvibe

Create a .unvibe.toml config file in your project folder:

# For example, to use Claude:
[ai]
provider = "claude"
api_key = "sk-..."
model = "claude-3-5-haiku-latest"
max_tokens = 5000

# Or, to use a local Ollama:
[ai]
provider = "ollama"
model = "deepseek-r1:8b"
host = "http://localhost:11434"

# To use the OpenAI or DeepSeek API:
[ai]
provider = "openai"
base_url = "https://api.deepseek.com"
api_key = "sk-..."
temperature = 0.0
max_tokens = 1024

# To use the Gemini API:
[ai]
provider = "gemini"
api_key = "..."
model = "gemini-2.0-flash"

# Advanced parameters to tune the search:
[search]
random_spread = 4       # How many random tries to make before selecting the best move.
max_depth = 8           # Maximum depth of the search tree.
max_temperature = 0.3   # Picks random temperatures up to this value.
# In general, some models perform better at lower temperatures;
# higher temperature = more exploration.
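.unvibe.toml is plain TOML, so you can sanity-check your settings with the standard library (tomllib ships with Python 3.11+). The keys below are the ones shown in the examples above:

import tomllib

with open(".unvibe.toml", "rb") as f:
    config = tomllib.load(f)

print(config["ai"]["provider"])  # e.g. "claude" or "ollama"
print(config.get("search", {}))  # the optional [search] tuning table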

Research

This approach has been explored in various research papers. For example, from "LLM-based Test-driven Interactive Code Generation: User Study and Empirical Evaluation" (Microsoft Research, https://arxiv.org/abs/2404.10100v1):

Our results are promising with using the OpenAI Codex LLM on MBPP: our best algorithm improves the pass@1 code generation accuracy metric from 48.39% to 70.49% with a single user query, and up to 85.48% with up to 5 user queries. Second, we can generate a non-trivial functional unit test consistent with the user intent within an average of 1.69 user queries for 90.40% of the examples for this dataset.

Related Article

For more information, check the original article: Unvibe: Generate code that passes unit-tests

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

unvibe-0.1.1.tar.gz (17.9 kB)

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

unvibe-0.1.1-py3-none-any.whl (20.9 kB)

File details

Details for the file unvibe-0.1.1.tar.gz.

File metadata

  • Download URL: unvibe-0.1.1.tar.gz
  • Upload date:
  • Size: 17.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.11.11 Darwin/23.6.0

File hashes

Hashes for unvibe-0.1.1.tar.gz
SHA256: bae0659f2d0f203975541ca48d758cf6258d52780e7b817d365ab0b0c5479f0c
MD5: 50c831dc358f5dbf30c343e79c0277ec
BLAKE2b-256: 40a33fe1ad337694ef010147517437efa9c9903a466b4f7afeabb7ef78d13292

File details

Details for the file unvibe-0.1.1-py3-none-any.whl.

File metadata

  • Download URL: unvibe-0.1.1-py3-none-any.whl
  • Upload date:
  • Size: 20.9 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.0.0 CPython/3.11.11 Darwin/23.6.0

File hashes

Hashes for unvibe-0.1.1-py3-none-any.whl
SHA256: 7c108d2270cd0cabde6a44d624f894dc4514f7ed0cf18e79d050a3c703308d46
MD5: fd252a8d25ad35ba552adeafebbe8f6d
BLAKE2b-256: 6dc51f3dbf85d49f1b6e8efb64df3df16d529a09f94478585097eae0b4411c62
