
Parses the contents of a given directory to create a text vector store that can be consumed in prompts for various LLM models.

Project description

Creating Unit Tests using OpenAI

Introduction

The original intent of this codebase was to perform prompt engineering via "vectorization" of a Java codebase and then feed the embedded text to OpenAI so it can automatically generate unit tests. More languages and LLMs will eventually be supported, and the use cases are not necessarily limited to unit test generation.

This repository contains several unrelated/experimental files from past iterations, but in general the module lives in the src/llm_prompt_creator directory.

The instructions in this README are kept up to date as much as possible.

Contributing

Note that the main branch is locked down but does allow pull requests.

To contribute, create a feature or fix branch (prefixed with feature_ or fix_, respectively), commit your changes there, and then create a pull request from your branch into main.

We will review and, after approval, merge your branch, then delete the remote branch on our GitHub repo to limit leftover branches.
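The workflow above can be sketched as follows (the branch name is hypothetical, and the git commands are shown as comments since they depend on your clone):

```shell
# Hypothetical branch name following the repo's convention
branch="feature_support_python_parsing"

# Guard against a mis-named branch before pushing
# (the repo expects a feature_ or fix_ prefix)
case "$branch" in
  feature_*|fix_*) echo "branch name OK: $branch" ;;
  *) echo "error: branch must start with feature_ or fix_" >&2; exit 1 ;;
esac

# Typical workflow, run against your clone:
# git checkout -b "$branch"
# git commit -am "describe your change"
# git push -u origin "$branch"
# ...then open a pull request from "$branch" into main.
```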

Set Up

Note: Windows users may need to install the Visual Studio C++ compiler to use this package.

Simple example usage:

# Below is an example of how to set the OpenAI key;
# it must be set before the "langchain" and "llm_prompt_creator" imports.
# Create an "openai-key.txt" in the same directory as your test.py file.

import os

with open('openai-key.txt', 'r') as f:
    key = f.read().strip()
    os.environ["OPENAI_API_KEY"] = key


from langchain.chat_models import ChatOpenAI
from llm_prompt_creator import prompt as PR

dir = "<path to your java codebase directory>"

# Chunk & store your codebase as tokenized chunks via javalang.
# Defaults to storing successfully chunked files in "./chunks.json".

PR.chunker(dir)

"""
You could optionally store the chunks strictly in memory by instead using the below when chunking your
directory:
"""
#data = PR.chunker(dir, write_to_disk=False)

"""
Create a vector store to perform a similarity search against when asking questions to your
LLM. Defaults to consume from the "./chunks.json" file.
"""
store = PR.create_vectorstore()

"""
If opting to save the store to disk, use the below instead which passes a
directory where the store will be saved. It will also load the store into
memory for follow on commands.
"""
#PR.create_vectorstore(persist_directory="db")
#store = PR.load_vectorstore(persist_directory="db")

# Start an open-ended chat conversation with your LLM based on your vector store.
# Will continue prompting the user for inputs until they type 'exit'.
# Subject to model limitations (especially token limits).
PR.prompt(store=store, llm=ChatOpenAI(model="gpt-4", temperature=0))

"""
To show the context provided (provided by the vector store based on the user's question)
uncomment the below:
"""
#PR.prompt(store, show_context=True)

"""
To not write the accumulated context to disk while still displaying context in terminal, use the below:
"""

#PR.prompt(store, show_context=True, write_to_disk=False)

"""
To provide a custom prompt template or a list of questions to be automatically prompted for, use the filePath parameter.
The file should be a json file with properties of promptTemplate and questions. An example file can be found below:
"""
{
"promptTemplate": "",
"questions": ["question 1", "question 2"]
}


#PR.prompt(store, show_context=True, filePath="./file_input.json", llm=ChatOpenAI(model="gpt-4", temperature=0))
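The input file described above can also be created programmatically. A minimal sketch (the file name and questions are illustrative):

```python
import json

# Illustrative prompt-template/questions file for the filePath parameter
file_input = {
    "promptTemplate": "",
    "questions": ["question 1", "question 2"],
}

# Write it next to your test.py so PR.prompt can pick it up via filePath
with open("file_input.json", "w") as f:
    json.dump(file_input, f, indent=2)
```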

Following the example should yield a response similar to the one shown in the repository README (subject to the LLM model used and codebase):

TODO

  • Refactoring across the board, particularly to reduce the number of called Python scripts.
  • Optimize the chunker to allow larger codebase directories.
  • Establish a standard way of calculating token limits.
  • Use token limits to dynamically adjust the amount of context, and therefore the number of tokens used, during a prompt/completion instance with OpenAI.
  • Containerize this solution so we can deploy it; one container for parsing and chunking, another for creating a vector store and prompting (or something like it).
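One lightweight way to approach the token-limit items above is a character-based estimate (roughly 4 characters per token is a common rule of thumb for OpenAI models). This sketch is illustrative only and not part of the package:

```python
def approx_token_count(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4-characters-per-token heuristic."""
    return max(1, round(len(text) / chars_per_token))


def select_context(chunks, limit: int = 8192, reserve: int = 1024):
    """Greedily keep chunks until the estimated token budget is spent.

    `reserve` leaves headroom for the question and the model's completion.
    """
    kept, used = [], 0
    for chunk in chunks:
        cost = approx_token_count(chunk)
        if used + cost > limit - reserve:
            break
        kept.append(chunk)
        used += cost
    return kept
```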

Project details


Download files

Download the file for your platform.

Source Distribution

llm_prompt_creator-0.5.0.tar.gz (7.5 kB)

Uploaded Source

Built Distribution


llm_prompt_creator-0.5.0-py3-none-any.whl (7.9 kB)

Uploaded Python 3

File details

Details for the file llm_prompt_creator-0.5.0.tar.gz.

File metadata

  • Download URL: llm_prompt_creator-0.5.0.tar.gz
  • Size: 7.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.4

File hashes

Hashes for llm_prompt_creator-0.5.0.tar.gz
  • SHA256: 68431a1d71744e531d69387504a592d7d91f41c4d0834f960c358c21867704eb
  • MD5: ee4c60c3ed4b64288489cbb35581900c
  • BLAKE2b-256: 0cca2ae2823c71715578190ad7d6cd6c3e7be8a7812d00b9c2b172d784711e1e
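To verify a downloaded distribution against the SHA256 digest above, a minimal check with Python's standard-library hashlib (the file path assumes the archive sits in the current directory):

```python
import hashlib

# Expected SHA256 for llm_prompt_creator-0.5.0.tar.gz (from the table above)
EXPECTED = "68431a1d71744e531d69387504a592d7d91f41c4d0834f960c358c21867704eb"


def sha256_of(path: str) -> str:
    """Compute a file's SHA256 digest, reading in blocks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(8192), b""):
            h.update(block)
    return h.hexdigest()


# After downloading:
# assert sha256_of("llm_prompt_creator-0.5.0.tar.gz") == EXPECTED
```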


File details

Details for the file llm_prompt_creator-0.5.0-py3-none-any.whl.

File metadata

File hashes

Hashes for llm_prompt_creator-0.5.0-py3-none-any.whl
  • SHA256: c064a34c7143acf078273acc670210b3e1b5ed07830d25426f0b8256f913ae44
  • MD5: 52f912332f28d146c4bb18d7d2f20489
  • BLAKE2b-256: 0c4ff9d60e54bde3e06783a7eba3c91a0b9949ca14826e45627f52043cdf3d7a

