A Python library for runtime, LLM-supported code corrections.
fukkatsu 復活
pip install fukkatsu
OpenAI API
fukkatsu requires the environment variable OPENAI_API_KEY to be set to your OpenAI API key.
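A quick way to confirm the variable is visible to Python before using the library is sketched below. The helper name `get_openai_api_key` is hypothetical and not part of fukkatsu; it only reads the environment.

```python
import os


def get_openai_api_key() -> str:
    """Return the OpenAI API key, or raise a clear error if it is not set.

    Hypothetical helper for illustration; fukkatsu itself may read the
    variable differently.
    """
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before using fukkatsu")
    return key
```

Calling `get_openai_api_key()` at the start of a script fails early with a readable message instead of failing later inside an API call.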
Description
This is a proof of concept for a library that leverages LLMs to dynamically fix and improve code during execution. Fukkatsu (復活) is the Japanese word for "resurrection" or "revival". Metaphorically speaking, this library will attempt to fix your car's tire while you are driving it at 300 km/h.
Insane? Yes. Possible? Maybe. Fun? Definitely.
Here is a representation of what I am trying to do: https://giphy.com/gifs/tire-kNRqJCLOe6ri8/fullscreen
This concept currently only applies to interpreted languages such as Python and not to compiled languages such as C++. The very nature of interpreted languages allows us to dynamically change the code during runtime.
Furthermore, fukkatsu introduces a method to enhance ordinary functions with the power of LLMs. By decorating ordinary functions with natural language prompts, they can now dynamically adapt to unforeseen inputs.
MVP
You can find an MVP in the poc folder. Run it with python mvp.py. The code simulates a failing function that is repaired during execution. mvp.py does not request a correction from an OpenAI LLM; it simply uses a mock corrected function.
Foundation
Example:
- We have a function called my_function that accepts three arguments, 'x', 'y', and 'z', and returns a value calculated via x / y + z.
- Let's assume my_function accidentally receives the value 0 for the argument 'y'. This causes the function to fail with a ZeroDivisionError because the case was not accounted for in the original function.
- fukkatsu offers a second chance here via the @mvp_reanimate decorator.
- The decorator catches the error and requests a correction from an OpenAI LLM such as gpt-3.5-turbo.
- The corrected function receives the original arguments and handles the error as intended.
- To get the most out of fukkatsu's correction ability, it is paramount that the user provides a good description of the function and its intended purpose via a well-defined docstring.
- fukkatsu makes sure that the LLM receives all the information it needs to correct the function without changing its original purpose:
  - Full error traceback
  - Original function code
  - Passed arguments
@mvp_reanimate
def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    result = x / y + z
    return result

print(my_function(x = 1, y = 0, z= 2)) # would fail, but is corrected and returns 2
print(my_function(x = 2, y = 0, z= 10)) # would fail, but is corrected and returns 10
print(my_function(x = 9, y = 1, z= 2) + 10 ) # would not fail, returns 21.0
Please note that the example above is trivial; LLMs such as gpt-3.5-turbo are able to correct more complex functions. Once the library is more mature, further experiments and examples will show whether such a use case for LLMs is worthwhile.
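The flow described in this section can be sketched without any API call, in the spirit of the mock used by mvp.py. This is an illustration under stated assumptions, not the library's implementation: `reanimate_sketch` and `mock_llm_correction` are hypothetical names, and the "LLM" is a stub that always returns the same corrected source.

```python
import functools
import inspect
import traceback


def mock_llm_correction(source: str, error_trace: str, arguments: dict) -> str:
    """Hypothetical stand-in for an LLM call: returns corrected source code."""
    return (
        "def my_function(x, y, z):\n"
        "    if y == 0:\n"
        "        return z\n"
        "    return x / y + z\n"
    )


def reanimate_sketch(func):
    """Illustrative decorator: on failure, swap in corrected code and retry once."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # Collect the three pieces of context listed above.
            error_trace = traceback.format_exc()
            try:
                source = inspect.getsource(func)
            except (OSError, TypeError):
                source = "<source unavailable>"
            arguments = dict(inspect.signature(func).bind(*args, **kwargs).arguments)

            corrected_source = mock_llm_correction(source, error_trace, arguments)

            # Compile the suggestion in an isolated namespace and call it
            # with the original arguments.
            namespace = {}
            exec(corrected_source, namespace)
            return namespace[func.__name__](*args, **kwargs)

    return wrapper


@reanimate_sketch
def my_function(x, y, z):
    """Divide x by y and add z. Should return z if y is 0."""
    return x / y + z


print(my_function(x=1, y=0, z=2))  # original fails, corrected version returns 2
```

Executing model-generated source with `exec` runs arbitrary code in the current process, which is the core trade-off of this approach.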
Extra life
Here is again a representation of what I am trying to achieve: https://media.tenor.com/r5nBe8Ft6yEAAAAC/ready-player-one-extra-life.gif
The MVP code now offers the concept of extra lives. The idea is to let the user define, per function, how often an LLM should attempt to fix errors. This allows the LLM to explore other ways of fixing the code at runtime, while also bounding the runtime spent on corrections.
Example:
@mvp_reanimate(lives=2)
def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    result = x / y + z
    return result
The above example allows the LLM to attempt to fix the function twice. If the LLM fails to fix the function after two attempts, a flatline error is raised, which indicates that the LLM was not able to fix the function during runtime.
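The bounded retry behind extra lives can be sketched as a loop over at most `lives` attempts. Everything here is an assumption made for illustration: `with_lives`, `FlatlineError`, and the `propose_fix` callback (which stands in for the LLM) are hypothetical names, not the library's API.

```python
import functools


class FlatlineError(Exception):
    """Hypothetical error raised once all lives are used up."""


def with_lives(lives, propose_fix):
    """Illustrative decorator factory: try at most `lives` proposed fixes."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as error:
                last_error = error

            for attempt in range(1, lives + 1):
                # propose_fix stands in for the LLM; it returns a callable.
                candidate = propose_fix(attempt, last_error)
                try:
                    return candidate(*args, **kwargs)
                except Exception as error:
                    last_error = error  # keep the newest failure for the next attempt

            raise FlatlineError(f"no working fix after {lives} attempts") from last_error

        return wrapper

    return decorator


def propose_fix(attempt, error):
    """Stub: the first suggestion is still broken, the second one works."""
    if attempt == 1:
        return lambda x, y, z: x / y + z
    return lambda x, y, z: z if y == 0 else x / y + z


@with_lives(lives=2, propose_fix=propose_fix)
def my_function(x, y, z):
    return x / y + z


print(my_function(x=1, y=0, z=2))  # second attempt succeeds and returns 2
```

With `lives=1` the same call would exhaust its single attempt and raise `FlatlineError`, which is the bound on runtime that the section describes.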
fukkatsu 0.0.1 - Extra Life
fukkatsu 0.0.1 incorporates all the features demonstrated within the MVP section and introduces the concept of additional requests. Additional requests provide users with an alternative means of giving specific instructions to the LLM when a correction to a function is required. These additional requests act as a safeguard against potential misinterpretations by the LLM.
@resurrect(lives=1, additional_req = "add to any result 1000")
def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    result = x / y + z
    return result

print(my_function(x = 1, y = 0, z= 2))
print(my_function(x = 1, y = 0, z= 2)) # second call will trigger short-term memory capabilities
ERROR:root:division by zero
Traceback (most recent call last):
  File "xxxxxxxxxxxxxxxxxxxxx", line 20, in wrapper
    result = func(*args, **kwargs)
  File "xxxxxxxxxxxxxxxxxxxxx", line 6, in my_function
    result = x / y + z
ZeroDivisionError: division by zero
WARNING:root:Input arguments: {'x': 1, 'y': 0, 'z': 2}
WARNING:root:
Source Code:
def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    result = x / y + z
    return result
WARNING:root:Requesting INITIAL correction
WARNING:root:Received INITIAL suggestion: def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    if y == 0:
        return z + 1000
    else:
        result = x / y + z
        return result + 1000
WARNING:root:Attempt 1 to reanimate
WARNING:root:Reanimation successful, using def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    if y == 0:
        return z + 1000
    else:
        result = x / y + z
        return result + 1000
ERROR:root:division by zero
Traceback (most recent call last):
  File "xxxxxxxxxxxxxxxxxxxxxxx", line 20, in wrapper
    result = func(*args, **kwargs)
  File "xxxxxxxxxxxxxxxxxxxxxxx", line 6, in my_function
    result = x / y + z
ZeroDivisionError: division by zero
WARNING:root:Input arguments: {'x': 1, 'y': 0, 'z': 2}
WARNING:root:
Source Code:
def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    result = x / y + z
    return result
WARNING:root:Correction already in memory
WARNING:root:Attempt 1 to reanimate
WARNING:root:Reanimation successful, using def my_function(x, y, z):
    """
    function to divide x by y and add to the result z. Should return z if y is 0.
    """
    if y == 0:
        return z + 1000
    else:
        result = x / y + z
        return result + 1000
1002
1002
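The "Correction already in memory" line in the log suggests that a correction is requested once and reused on later failures. A minimal sketch of that idea follows; the cache layout, the key, and the names `resurrect_with_memory` and `fake_llm_fix` are assumptions for illustration, not the library's internals.

```python
import functools

# Hypothetical short-term memory: function name -> corrected callable.
_memory = {}
llm_requests = 0  # counts how often the stubbed "LLM" is asked for a fix


def fake_llm_fix(func):
    """Stub for the LLM request; mirrors the suggestion shown in the log above."""
    global llm_requests
    llm_requests += 1
    return lambda x, y, z: (z if y == 0 else x / y + z) + 1000


def resurrect_with_memory(func):
    """Illustrative decorator: request a fix once, then reuse it from memory."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            key = func.__qualname__
            if key not in _memory:
                _memory[key] = fake_llm_fix(func)  # initial correction
            return _memory[key](*args, **kwargs)   # later calls reuse it

    return wrapper


@resurrect_with_memory
def my_function(x, y, z):
    return x / y + z


print(my_function(x=1, y=0, z=2))  # 1002, after one request to the stub
print(my_function(x=1, y=0, z=2))  # 1002, served from memory
print(llm_requests)                # 1
```

As in the log, the original function still runs and fails on the second call; only the request to the model is skipped.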
fukkatsu 0.0.2 - The Ghost in the Machine
The mutate decorator introduces a new way to enhance ordinary functions dynamically via the power of LLMs, enabling them to adapt to specific inputs. It provides users with the ability to extend the capabilities of functions through natural language prompts. Additionally, the decorator can be further extended using the resurrect decorator. The mutate decorator enables users to program and account for cases that are challenging or impossible to anticipate.
import pandas as pd

@resurrect(lives=1)
@mutate(request="Check the inputs closely. Given the inputs, make sure that the function is able to handle different formats if necessary")
def my_mutated_function(file_path: str) -> pd.DataFrame:
    """
    Function to read a file and return a DataFrame.
    """
    return pd.read_csv(file_path)

my_mutated_function("test_file.xlsx")
fukkatsu 0.0.3 - Laissez-faire
The mutate and resurrect decorators now support a new argument called allow_installs. By default, allow_installs is set to False. When set to True, the LLM can check whether the Python libraries it suggests or uses are installed on the system. Any library that is missing is installed before code execution continues. This argument gives the LLM even more freedom, so setting it to True should be considered carefully.
resurrect
def resurrect(lives: int = 1, additional_req: str = "", allow_installs: bool = False):
    ...
mutate
def mutate(request: str = "", allow_installs: bool = False):
    ...
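The check-then-install behavior described for allow_installs could look roughly like the sketch below. It is not taken from fukkatsu's source: `is_installed` and `ensure_installed` are hypothetical helpers, and the sketch assumes the pip package name equals the top-level module name, which does not hold for every package.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name: str) -> bool:
    """True if the import system can find the top-level module."""
    return importlib.util.find_spec(module_name) is not None


def ensure_installed(module_name: str, allow_installs: bool = False) -> bool:
    """Return True if the module is importable, installing it only when allowed."""
    if is_installed(module_name):
        return True
    if not allow_installs:
        return False  # default: never touch the environment
    # Assumption: the pip package name matches the module name.
    subprocess.check_call([sys.executable, "-m", "pip", "install", module_name])
    importlib.invalidate_caches()  # make the new package visible to find_spec
    return is_installed(module_name)


print(ensure_installed("json"))  # standard library module, so True
```

Keeping the install step behind an explicit flag mirrors the default of False: a model-chosen package name is installed only when the caller has opted in.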
Testing Capabilities
This section will conduct a series of simulations to better understand fukkatsu's potential capabilities. To achieve this, multiple error types will be tested. Each error type and scenario will consist of a total of 25 runs under the same conditions. We will test the following hypotheses:
Hypothesis testing
- H0: "The proportion of errors solved is not significantly greater than 0.5."
- H1: "The proportion of errors solved is significantly greater than 0.5."
We will use a significance level of 0.05 and a one-sided binomial test.
import scipy.stats as stats

successes = 0  # placeholder: replace with the number of successful repairs out of 25
alpha = 0.05
# binomtest replaces binom_test, which was deprecated and later removed from SciPy
p_value = stats.binomtest(successes, n=25, p=0.5, alternative='greater').pvalue
print(f"p_value: {p_value}")
if p_value < alpha:
    print("Reject the null hypothesis")
    print("The proportion of errors solved is significantly greater than 0.5.")
else:
    print("Fail to reject the null hypothesis")
    print("The proportion of errors solved is not significantly greater than 0.5.")
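To make the decision rule concrete, the smallest number of successful repairs out of 25 that rejects H0 at the 0.05 level can be computed directly from the binomial distribution. The sketch below uses only the standard library; `binom_upper_tail` is a helper introduced here for illustration.

```python
from math import comb


def binom_upper_tail(successes: int, n: int = 25, p: float = 0.5) -> float:
    """P(X >= successes) for X ~ Binomial(n, p), i.e. the one-sided p-value."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes, n + 1))


alpha = 0.05
# Smallest success count whose one-sided p-value falls below alpha.
min_successes = next(k for k in range(26) if binom_upper_tail(k) < alpha)

print(min_successes)                              # 18
print(round(binom_upper_tail(min_successes), 4))  # 0.0216
print(round(binom_upper_tail(17), 4))             # 0.0539
```

So with 25 runs per scenario, 18 or more successful repairs reject the null hypothesis, while 17 or fewer do not.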
fukkatsu will use the gpt-3.5-turbo model in all simulations. For each simulation, 3 lives will be allocated. The functions will also be provided with sufficient context.
After conducting all the tests, we will apply a chi-square test. This test helps determine whether there is a statistically significant difference in fukkatsu's performance across the error types. If the test indicates a significant association, it suggests that the effectiveness of fukkatsu varies depending on the error type.
import numpy as np
from scipy.stats import chi2_contingency
observed_counts = np.array([[10, 20], [15, 25], [5, 30]])
chi2, p_value, dof, expected = chi2_contingency(observed_counts)
print("Chi-square statistic:", chi2)
print("P-value:", p_value)
print("Degrees of freedom:", dof)
print("Expected counts:", expected)
Each simulation is recorded in the Jupyter notebooks contained in the research directory.