Python package for generating training data from documents.

SpiceJack

SpiceJack is a Python tool for generating JSON question-and-answer pairs from documents.

Usage

from spicejack.pdf import PDFprocessor

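def filter1(items):
    """
    Example filter: takes a list of strings and returns the filtered list.
    """
    # Avoid naming the parameter "list", which would shadow the built-in type.
    return [i.replace("badword", "b*dword") for i in items]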


processor = PDFprocessor(
    "/path/to/Tax_Evasion_Tutorial.pdf",
    use_legitimate = True, # Runs the processor with the OpenAI API (see "Legitimate use")
    filters = (filter1,) # Extra custom filters
)

processor.run(
    thread = True, # Runs the processor in a child thread. (threading.Thread)
    process = True, # Runs the processor in a child process. (multiprocessing.Process)
    logging = True # Prints the responses from the LLM
)
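
The filters argument above takes a tuple of functions shaped like filter1: each receives a list of strings and returns a list of strings. The sketch below shows two more filters in that shape and one way they could be chained. How SpiceJack applies filters internally is not documented here, so apply_filters is a hypothetical helper that assumes they run in order, each on the previous filter's output.

```python
from typing import Callable, Iterable, List

# Assumed filter shape, inferred from filter1: list of strings in, list of strings out.
Filter = Callable[[List[str]], List[str]]


def strip_whitespace(items: List[str]) -> List[str]:
    """Trim leading and trailing whitespace from every string."""
    return [item.strip() for item in items]


def drop_empty(items: List[str]) -> List[str]:
    """Remove strings that are empty."""
    return [item for item in items if item]


def apply_filters(items: List[str], filters: Iterable[Filter]) -> List[str]:
    """Hypothetical helper: run each filter in order on the previous result."""
    for current_filter in filters:
        items = current_filter(items)
    return items


cleaned = apply_filters(
    ["  first sentence ", "   ", "second sentence"],
    (strip_whitespace, drop_empty),
)
print(cleaned)  # ['first sentence', 'second sentence']
```

Under that assumption, the same tuple could be passed as filters = (strip_whitespace, drop_empty) when constructing the processor.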

Legitimate use

Create a file named .env with the following content:

OPENAI_API_KEY = "<YOUR-OPENAI-API-KEY>"
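
Before running with use_legitimate = True, it can help to confirm that the .env file is readable and the key has been filled in. The following is a minimal standard-library sketch; read_env_file and has_openai_key are hypothetical helpers, not part of SpiceJack, and the parsing only covers the simple KEY = "value" form shown above.

```python
from pathlib import Path
from typing import Dict


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY = "value" lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(path: str = ".env") -> bool:
    """Return True if the file exists and OPENAI_API_KEY is set to a real value."""
    if not Path(path).is_file():
        return False
    key = read_env_file(path).get("OPENAI_API_KEY", "")
    # The placeholder from the template does not count as a configured key.
    return bool(key) and key != "<YOUR-OPENAI-API-KEY>"
```

Calling has_openai_key() from the directory that holds .env returns False while the placeholder is still in place.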

Installation

pip install spicejack

Support me

SpiceJack is completely free to use, but donations are much appreciated, as I am developing it on a 10+ year old laptop.

Bitcoin

bc1q7xaxer2xpxttm3vpzc8s9dutvck8u9ercxxc95

Ethereum

0xB7351e098c80E2dCDE48BB769ac14c599E32c47E

Monero

44Y47Sf2huJV4hx7K1JrTeKbgkPsWdRWSbEiAHRWKroaGYAnxkPjdxhUsDeiFeQ3wc6Tw8v3uYTZMbBUfcdUUgqt5HCqbtY

Litecoin

LQzd9phuN7iPRn8p5rT1zyVssJ8nY5WjM5

Roadmap

  • Python library

  • Mass generation

  • GUI

License


This project is licensed under the GNU GPL v3.0.
