Python package for generating training data from documents.
SpiceJack
SpiceJack is a Python tool for generating JSON question-and-answer pairs from documents.
Usage
from spicejack.pdf import PDFprocessor


def filter1(items):
    """
    Example filter: censors a word in every string it receives.
    """
    return [i.replace("badword", "b*dword") for i in items]


processor = PDFprocessor(
    "/path/to/Tax_Evasion_Tutorial.pdf",
    use_legitimate = True,  # Runs the processor with the OpenAI API (see "Legitimate use")
    filters = (filter1,),  # Extra custom filters
)
processor.run(
    thread = True,  # Runs the processor in a child thread (threading.Thread)
    process = True,  # Runs the processor in a child process (multiprocessing.Process)
    logging = True,  # Prints the responses from the LLM
)
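The filters argument above takes plain callables. As a rough sketch, assuming each filter receives a list of strings and returns a new list (as filter1 does) and that filters run in the order given, custom filters might look like the following. The names collapse_whitespace, drop_short, and apply_filters are hypothetical and not part of SpiceJack; apply_filters only imitates how a chain of filters might be applied, so the example runs without the library installed.

```python
import re


def collapse_whitespace(sentences):
    """Replace runs of whitespace with a single space and trim each entry."""
    return [re.sub(r"\s+", " ", s).strip() for s in sentences]


def drop_short(sentences, min_words=3):
    """Discard entries with fewer than min_words words (hypothetical threshold)."""
    return [s for s in sentences if len(s.split()) >= min_words]


def apply_filters(sentences, filters):
    """Apply each filter in order. Assumption: SpiceJack chains filters similarly."""
    for f in filters:
        sentences = f(sentences)
    return sentences


raw = ["  Taxes   are due\nin April. ", "Page 3", "Keep all receipts for audits."]
cleaned = apply_filters(raw, (collapse_whitespace, drop_short))
print(cleaned)
```

If that assumption holds, the same functions could be passed as filters = (collapse_whitespace, drop_short) when constructing the processor.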
Legitimate use
Create a file named .env containing the following line:
OPENAI_API_KEY = "<YOUR-OPENAI-API-KEY>"
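Before running with use_legitimate enabled, it can help to confirm that the .env file really defines the key. The sketch below uses only the standard library and does not reflect how SpiceJack itself loads the file; read_env_value and has_api_key are hypothetical helper names, and the check that the value does not start with "<" is just a guard against leaving the placeholder in place.

```python
from pathlib import Path


def read_env_value(path, name):
    """Return the value assigned to `name` in a .env-style file, or None."""
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines that are not KEY = VALUE pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            # Drop surrounding whitespace and quotes around the value.
            return value.strip().strip("\"'")
    return None


def has_api_key(path=".env"):
    """True if the file exists and defines a non-placeholder OPENAI_API_KEY."""
    if not Path(path).is_file():
        return False
    value = read_env_value(path, "OPENAI_API_KEY")
    return bool(value) and not value.startswith("<")
```

Calling has_api_key() from the directory that holds the .env file returns False while the placeholder is still in place.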
Installation
pip install spicejack
Support me
SpiceJack is completely free to use, but donations are much appreciated, as I am developing it on a 10+ year old laptop.
Bitcoin
bc1q7xaxer2xpxttm3vpzc8s9dutvck8u9ercxxc95
Ethereum
0xB7351e098c80E2dCDE48BB769ac14c599E32c47E
Monero
44Y47Sf2huJV4hx7K1JrTeKbgkPsWdRWSbEiAHRWKroaGYAnxkPjdxhUsDeiFeQ3wc6Tw8v3uYTZMbBUfcdUUgqt5HCqbtY
Litecoin
LQzd9phuN7iPRn8p5rT1zyVssJ8nY5WjM5
Roadmap
- Python library
- Mass generation
- GUI
License
This project is licensed under GNU_GPL_v3.0.
Project details
File details
Details for the file spicejack-0.45b0.tar.gz.
File metadata
- Download URL: spicejack-0.45b0.tar.gz
- Upload date:
- Size: 43.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 87b764412945aa4b55b0496c112d8b8e6e03f1b55efe5f4221e464a3115d69ff |
| MD5 | 19f892638c459a55173af60914fe32af |
| BLAKE2b-256 | f8b2bc65def45d72ab3dc7b1bc52a095574d5a3a91d7366ed143ece6f44d6203 |
File details
Details for the file spicejack-0.45b0-py3-none-any.whl.
File metadata
- Download URL: spicejack-0.45b0-py3-none-any.whl
- Upload date:
- Size: 32.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/5.1.1 CPython/3.9.20
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1a3bb461020f61dfafb3ce4acbb005f86375a955aeb4cab2dd97abf99d2efa0c |
| MD5 | 4e246b21d6b10c56d4076b67c85a8c93 |
| BLAKE2b-256 | e0463d5559aa80fa23404e0f5d7411aefc1c2dd4076a74a8ec45d15e8382177c |
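The published digests can be used to check a manually downloaded file. A minimal sketch using the standard library's hashlib follows; sha256_of and matches_published_hash are hypothetical helper names, and the file path passed in is whatever location the archive was saved to.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, matches_published_hash("spicejack-0.45b0-py3-none-any.whl", "1a3bb461020f61dfafb3ce4acbb005f86375a955aeb4cab2dd97abf99d2efa0c") should return True for an intact copy of the wheel.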