bettertest 📝🔍
⚡ A Python testing library for automatically evaluating and tracing LLM applications ⚡
Our goal with bettertest is to simplify the process of testing and debugging LLM applications. It automatically evaluates your model's responses against your solution answers (auto-eval) and provides tracing features out-of-the-box.
With bettertest, you can automatically test your LLM applications and view print statements for each run just by adding 'bettertest' to any print statement in your code.
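As a rough illustration of tagging print statements, the sketch below collects any printed line that starts with the `bettertest` prefix while a function runs. It is not bettertest's actual implementation. The helper `capture_tagged_prints` and the `demo_llm` function are hypothetical names introduced here, and the `bettertest:` prefix format is assumed from the example project further down.

```python
import contextlib
import io


def capture_tagged_prints(func, *args, prefix="bettertest"):
    """Run func and return (result, tagged_lines).

    Hypothetical helper: gathers stdout lines that start with `prefix`.
    """
    buffer = io.StringIO()
    # Temporarily send everything printed by func into an in-memory buffer.
    with contextlib.redirect_stdout(buffer):
        result = func(*args)
    tagged = [
        line for line in buffer.getvalue().splitlines()
        if line.startswith(prefix)
    ]
    return result, tagged


def demo_llm(question):
    print("bettertest: received question")  # tagged, so it is collected
    print("debug noise")                    # untagged, so it is ignored
    return question.upper()


result, trace = capture_tagged_prints(demo_llm, "hello")
```

Only the tagged line ends up in `trace`, which is the behavior the prefix convention suggests: ordinary debug output stays out of the per-run trace.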
Getting Started
Before using BetterTest, you need to install it via pip:
pip install bettertest
After installation, import the BetterTest library in your Python project:
from bettertest import BetterTest
Using BetterTest
Example Project
!pip install bettertest
from bettertest import BetterTest
questions = ['what does a complex QA instance mean?',
'<@1098583990618837062> how can I train an instance based off of a websites documentation, without using PDF? could i web scrape the content and then upload that content? i am trying to ask questions based off of a websites documentation, and im not sure how to convert the data from the website easily',
'<@1098583990618837062> how can i train an instance based off of a websites documentation?',
'<@1098583990618837062> What does the ai model see the data as? Does it see it as the large corpus of text it is trained on or something else?']
def call_openai(question):
    index = questions.index(question)
    model_answers = ['A complex QA instance refers to a type of search in which users ask questions that require multiple pieces of context. For example, if a user asks "What is the age of the actress who plays Meg in Family Guy?", the system needs to know both "Who plays Meg in Family Guy?" and "What is Mila Kunis\'s age?" to provide an accurate answer. To handle this, the user question is broken down into sub-components, and the most relevant chunks for each sub-question are found and fed into the chatGPT/GPT-4 model to get the answer.\n \n REFERENCES\n Discord Chat; https://docs.berri.ai/api-reference/app_configurations/file_configuration;',
        "To train an instance based off of a website's documentation without using a PDF, you can scrape the content from the website and then upload it as a data source. You can use web scraping tools such as Beautiful Soup or Scrapy to extract the text from the website and then upload it using the BerriAI API. Once you have uploaded the data source, you can use it to train your instance and ask questions based on the website's documentation. However, please note that web scraping may be subject to legal restrictions, so make sure to check the website's terms of service and consult with a legal professional if necessary.\n \n REFERENCES\n Discord Chat;",
        'To train an instance based off of a website\'s documentation, you can use the "Recursive URLs" feature provided by BerriAI. This feature allows you to recursively parse URLs and extract text data from them, which can then be used to train your instance. \n \n\n Here\'s an example of how you can use this feature:\n \n\n 1. First, create a data source using the "create_data_source" endpoint. You can specify the URL of the website you want to train your instance on as the "data_source" parameter. \n \n\n 2. Next, use the "create_app" endpoint to create an instance of your GPT-4 app. In the payload, specify the "data_source" parameter as the ID of the data source you created in step 1. \n \n\n 3. Finally, use the "finetune_instance" endpoint to fine-tune your instance using the data extracted from the website. You can pass in a list of JSON objects containing context and correct_response attributes to this endpoint. \n \n\n Keep in mind that the quality of the instance you train will depend on the quality of the data you extract from the website. It\'s important to ensure that the data you extract is relevant and accurate.\n \n REFERENCES\n Discord Chat;',
        'Hello <@1098583990618837062>, according to the BerriAI API documentation, the AI model sees the data as embeddings, which are numerical representations of the text. These embeddings are generated using a pre-trained language model, such as GPT-3, and are used to make predictions and generate responses. The model does not see the data as the large corpus of text it is trained on, but rather as a set of numerical vectors that represent the semantic meaning of the text.\n \n REFERENCES\n Discord Chat;']
    print("bettertest: testing if this works")
    return model_answers[index]
answers = ['Sometimes users ask questions that require multiple pieces of context (e.g. What is the age of the actress who plays Meg in Family Guy? -> This requires us to know - Who plays Meg in Family Guy? Mila Kunis + What is Mila Kunis’s age?). To tackle this, we first run the user question through chatGPT, and have it break down that question into sub-components (as seen in the previous example) -> Find the most relevant chunks for each sub-question -> Feed that into chatGPT/GPT-4/whichever model you chose to get the answer to the users question.',
"You can either create an app with an input_url this way you won't need to scrape the website https://docs.berri.ai/api-reference/endpoint/create_app if this does not work you can also manually scrape the website yourself and create an instance with raw text by passing in JSON chunks to berri",
"You can either create an app with an input_url this way you won't need to scrape the website https://docs.berri.ai/api-reference/endpoint/create_app if this does not work you can also manually scrape the website yourself and create an instance with raw text by passing in JSON chunks to berri",
"Berri ingests your data, chunks it, creates embeddings. When user's ask questions, berri does a similarity search and retrieves the most similar chunks to answer the questions. These chunks are then fed into the llm to answer the question"]
bt = BetterTest("krrish@berri.ai", "YOUR_OPENAI_API_KEY")
bt.eval(questions, answers, call_openai)
Initialize BetterTest
Create an instance of the BetterTest class with your email address and your OpenAI API key:
bt = BetterTest("your_email@example.com", "your_openai_api_key")
Replace "your_email@example.com" with the appropriate email address.
Replace "your_openai_api_key" with your OpenAI API key, which you can find on the API keys page of your OpenAI account.
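To avoid hard-coding the key in source files, one option is to read both values from environment variables before constructing the client. The sketch below assumes this approach; `load_credentials` and the `BETTERTEST_EMAIL` variable name are hypothetical and not part of bettertest, while `OPENAI_API_KEY` is the name commonly used for OpenAI keys.

```python
import os


def load_credentials(env=os.environ):
    """Return (email, api_key) read from environment variables.

    BETTERTEST_EMAIL is a hypothetical variable name chosen for this sketch.
    """
    email = env.get("BETTERTEST_EMAIL")
    api_key = env.get("OPENAI_API_KEY")
    if not email or not api_key:
        raise RuntimeError("Set BETTERTEST_EMAIL and OPENAI_API_KEY first")
    return email, api_key


# Usage (requires bettertest to be installed and both variables to be set):
# from bettertest import BetterTest
# email, api_key = load_credentials()
# bt = BetterTest(email, api_key)
```

Failing early with a clear error keeps a missing key from surfacing later as an opaque authentication failure.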
Evaluate Model Responses
The eval() function takes a list of questions, a list of answers, an LLM function, and an optional num_runs argument. It automatically evaluates the model's response against the solution answer and provides tracing for each run. Use it as follows:
questions = [...] # List of questions
answers = [...] # List of corresponding solution answers
def llm_function(question):
    # Your custom LLM function implementation goes here
    pass
bt = BetterTest("your_email@example.com", "your_openai_api_key")
bt.eval(questions, answers, llm_function)
Replace llm_function with your own LLM function, and customize num_runs if necessary. By default, num_runs is set to 1.
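To make the roles of the arguments concrete, here is a minimal local sketch of what an evaluation loop with a num_runs parameter could look like. It is a stand-in for illustration only, not bettertest's implementation: `sketch_eval` and `echo_llm` are hypothetical names, and the character-level similarity from `difflib` is a crude placeholder for whatever scoring the real auto-eval performs.

```python
from difflib import SequenceMatcher


def sketch_eval(questions, answers, llm_function, num_runs=1):
    """Hypothetical stand-in for an eval loop; not bettertest's implementation."""
    if len(questions) != len(answers):
        raise ValueError("questions and answers must have the same length")
    results = []
    for run in range(num_runs):
        for question, expected in zip(questions, answers):
            response = llm_function(question)
            # Character-level similarity in [0, 1]; a placeholder metric.
            score = SequenceMatcher(None, expected, response).ratio()
            results.append({
                "run": run,
                "question": question,
                "response": response,
                "score": score,
            })
    return results


def echo_llm(question):
    # Toy LLM function used only to exercise the loop.
    return "Paris" if "France" in question else "unknown"


results = sketch_eval(["Capital of France?"], ["Paris"], echo_llm, num_runs=2)
```

Each question is asked once per run, so two runs over one question produce two records. Repeating runs this way is useful when the LLM function is non-deterministic and a single response may not be representative.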
Contributing
We welcome contributions to BetterTest! Feel free to create issues, open PRs, or DM us (👋 Hi I'm Krrish - +17708783106)
Changelog
The current version of BetterTest is 0.1.9.
License
BetterTest is released under the MIT License.
File details
Details for the file bettertest-0.1.98.tar.gz.
File metadata
- Download URL: bettertest-0.1.98.tar.gz
- Upload date:
- Size: 12.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.11.3 Darwin/19.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | d2dc19f59167357b8d99ee840de44d37bfae361ee5a01ba792ce3873da07befa
MD5 | 0f8c50be28df0864e2342b1c2f64546c
BLAKE2b-256 | ad6b6dbbf8f535e97031666948c3323a58b51c7094512fc7b215d0f9281c5776
File details
Details for the file bettertest-0.1.98-py3-none-any.whl.
File metadata
- Download URL: bettertest-0.1.98-py3-none-any.whl
- Upload date:
- Size: 15.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/1.5.1 CPython/3.11.3 Darwin/19.6.0
File hashes
Algorithm | Hash digest
---|---
SHA256 | cd1245ce80727ea35517f62ebf111f357a020de3563cb2c6cdd5aed65cf7a1cb
MD5 | 03818449958e5d0b1e0120b0379a2ad8
BLAKE2b-256 | 454106203b3432c9a6fedfd7c5842b41508839b3dfd4f16064a1e19f2c42c3fb