Takes a given directory and parses its contents to create a text vector store that can be consumed in prompts for various LLMs.

Project description

Creating Unit Tests using OpenAI

Introduction

The original intent of this codebase was to perform prompt engineering via "vectorization" of a Java codebase: the code is chunked and embedded, and the embedded text is fed to OpenAI so it can automatically generate unit tests. More languages and LLMs will eventually be supported, and the use cases are not necessarily limited to unit test generation.

This repository contains several unrelated/experimental files from past iterations, but in general the module lives in the src/llm_prompt_creator directory.

The instructions in this README are kept as up to date as possible.

Contributing

Note that the main branch is locked down but does accept pull requests.

To contribute, create a feature or fix branch (prefixed with feature_ or fix_, respectively), commit your changes there, and then open a pull request from your branch into main.

We will review and (after approval) merge your branch, then delete the remote branch on our GitHub repo to limit leftover branches.

Set Up

Note: Windows users may need to install the Visual Studio C++ compiler to use this package.
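
The package is published on PyPI, so a typical installation looks like the below. The OPENAI_API_KEY line is an assumption: it presumes the package's OpenAI client reads the standard OpenAI environment variable.

pip install llm_prompt_creator
export OPENAI_API_KEY="<your OpenAI API key>"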

Simple example usage:

from llm_prompt_creator import prompt as PR

src_dir = "<path to your Java codebase directory>"

# Chunk & store your codebase as tokenized chunks via javalang.
# Defaults to storing successfully chunked files in "./chunks.json".
PR.chunker(src_dir)

# Create a vector store to perform a similarity search against when asking
# questions to your LLM. Defaults to consuming from the "./chunks.json" file.
store = PR.create_store()

# Start an open-ended chat conversation with your LLM based on your vector store.
# Will continue prompting the user for input until they type 'exit'.
# Subject to model limitations (especially token limits).
PR.prompt(store)

# To show the context provided by the vector store (based on the user's question),
# uncomment the line below:
# PR.prompt(store, show_context=True)
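
For intuition, here is a rough sketch of the kind of per-method chunking that javalang enables. This is an illustration only, not this package's actual chunker; the helper name and the 40-line window are arbitrary choices for the sketch.

# Illustrative sketch only -- not this package's actual chunker.
# Assumes javalang is installed (pip install javalang).
import javalang

def chunk_java_source(path):
    """Split one Java file into rough per-method source chunks."""
    with open(path, encoding="utf-8") as f:
        source = f.read()
    lines = source.splitlines()
    chunks = []
    tree = javalang.parse.parse(source)
    # tree.filter yields (path, node) pairs for every matching AST node.
    for _, node in tree.filter(javalang.tree.MethodDeclaration):
        start = (node.position.line - 1) if node.position else 0
        chunks.append({
            "file": path,
            "method": node.name,
            # Arbitrary 40-line window starting at the method's first line.
            "text": "\n".join(lines[start:start + 40]),
        })
    return chunks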

Following the example should yield a response similar to the screenshot included with the project (subject to the LLM model used and the codebase).
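
The context injection that show_context=True reveals is, conceptually, a similarity search: the user's question is embedded and compared against the stored chunk embeddings, and the closest chunks are added to the prompt. A minimal, generic sketch of that ranking step using plain NumPy cosine similarity (the package's actual store may work differently):

import numpy as np

def top_k_chunks(query_vec, chunk_vecs, k=3):
    """Rank stored chunk embeddings by cosine similarity to the query embedding."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    scores = [cos(np.asarray(query_vec), np.asarray(v)) for v in chunk_vecs]
    # Indices of the k highest-scoring chunks, best first.
    return sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]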

TODO

  • Refactor across the board, particularly to reduce the number of separately invoked Python scripts.
  • Optimize the chunker to handle larger codebase directories.
  • Establish a standard way of calculating token limits.
  • Use token limits to dynamically adjust the amount of context, and therefore the number of tokens used, during a prompt/completion instance with OpenAI (see the sketch after this list).
  • Containerize this solution so we can deploy it: one container for parsing and chunking, another for creating a vector store and prompting (or something like it).
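
For the two token-limit items, one common approach is to count tokens with OpenAI's tiktoken tokenizer and greedily pack context chunks under a fixed budget. This is a sketch under those assumptions (tiktoken installed, an OpenAI chat model, and an arbitrary budget value), not the planned implementation:

# Sketch for the token-limit TODOs above.
# Assumes tiktoken is installed (pip install tiktoken).
import tiktoken

def pack_context(chunks, model="gpt-3.5-turbo", budget=3000):
    """Greedily add chunks until the token budget would be exceeded."""
    enc = tiktoken.encoding_for_model(model)
    packed, used = [], 0
    for chunk in chunks:
        n = len(enc.encode(chunk))
        if used + n > budget:
            break
        packed.append(chunk)
        used += n
    return packed, used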

Download files

Source distribution: llm_prompt_creator-0.2.16.tar.gz (7.4 kB)

Built distribution: llm_prompt_creator-0.2.16-py3-none-any.whl (6.5 kB)
