File search tool using OpenAI assistant.

Work in progress: still polishing a few features and gathering initial feedback.

Installation (pip)

pip install lumei

Usage

Example

The following example processes a list of PDF files and extracts the vendor and price from each file. The command requires an OpenAI API key, which can be obtained from https://platform.openai.com/account/api-keys.

lumei \
  --input-files ~/folder_1/*.pdf,~/folder_2/*.pdf \
  --output-file ~/Desktop/output.json \
  --openai-api-key=<OPENAI_API_KEY> \
  --query="[
  	{'name': 'vendor', 'search': 'Name of the vendor who issued the invoice.'}, 
  	{'name': 'price', 'search': 'Total bill from the invoice.'}
  ]"
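For batch jobs it can be convenient to assemble the same command from a script. The following is a minimal sketch, not part of lumei: build_lumei_command is a hypothetical helper that only uses the flags shown above. It emits the query as standard double-quoted JSON, on the assumption that lumei accepts it (the description of --query says "array of JSON objects", but the example above uses single quotes, so this is unverified).

```python
import json
import os
import shlex


def build_lumei_command(input_patterns, output_file, fields, api_key=None):
    """Return the argument list for a lumei invocation (hypothetical helper)."""
    # Each (name, search) pair becomes one object in the query array.
    query = json.dumps([{"name": name, "search": search} for name, search in fields])
    argv = [
        "lumei",
        # Expand "~" here so the result does not depend on shell tilde expansion.
        "--input-files", ",".join(os.path.expanduser(p) for p in input_patterns),
        "--output-file", os.path.expanduser(output_file),
        f"--query={query}",
    ]
    # The key is optional on the command line; it can come from OPENAI_API_KEY instead.
    if api_key is not None:
        argv.append(f"--openai-api-key={api_key}")
    return argv


command = build_lumei_command(
    ["~/folder_1/*.pdf", "~/folder_2/*.pdf"],
    "~/Desktop/output.json",
    [
        ("vendor", "Name of the vendor who issued the invoice."),
        ("price", "Total bill from the invoice."),
    ],
)
# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(command))
```

The list form can also be passed directly to subprocess.run(command, check=True), which avoids shell quoting altogether.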

Input Parameters

--input-files

Source files to process. Multiple inputs can be provided, separated by a comma (","). Each input can be a path to a single file or a wildcard pattern such as ~/folder_1/*.pdf.
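To preview which files a given --input-files value would cover, the expansion can be approximated in a few lines. This is a sketch under assumptions, not lumei's actual implementation: expand_input_files is a hypothetical name, and it assumes shell-style wildcard matching (as in the *.pdf example) with "~" expanded to the home directory.

```python
import glob
import os
import tempfile


def expand_input_files(spec):
    """Expand a comma-separated list of paths and wildcard patterns into file paths."""
    files = []
    for part in spec.split(","):
        pattern = os.path.expanduser(part.strip())
        if not pattern:
            continue  # ignore empty entries such as a trailing comma
        # glob.glob returns [] when nothing matches; an existing plain path matches itself.
        for path in sorted(glob.glob(pattern)):
            if path not in files:  # keep the first occurrence, drop duplicates
                files.append(path)
    return files


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "b.pdf", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = expand_input_files(f"{tmp}/*.pdf,{tmp}/notes.txt")
    print([os.path.basename(p) for p in found])
```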

--output-file

Path of the file that the results will be written to. The value must be a path to a single file. Supported file formats are ".csv", ".xlsx", and ".json".
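A wrapper script can check the output path before starting a long run. The sketch below is illustrative only: output_format is a hypothetical helper, and treating the extension as case-insensitive is an assumption rather than documented lumei behavior.

```python
from pathlib import Path

# The three formats listed for --output-file.
SUPPORTED_OUTPUT_SUFFIXES = {".csv", ".xlsx", ".json"}


def output_format(output_file):
    """Return the output format name, or raise ValueError if it is unsupported."""
    # Assumption: the extension is compared case-insensitively.
    suffix = Path(output_file).suffix.lower()
    if suffix not in SUPPORTED_OUTPUT_SUFFIXES:
        supported = ", ".join(sorted(SUPPORTED_OUTPUT_SUFFIXES))
        raise ValueError(f"unsupported output file {output_file!r}; expected one of {supported}")
    return suffix.lstrip(".")


print(output_format("~/Desktop/output.json"))  # json
```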

--openai-api-key [Optional]

OpenAI API key, required for the file search functionality. A key can be obtained from https://platform.openai.com/account/api-keys.

Alternatively, set the key as the "OPENAI_API_KEY" environment variable, in which case this flag can be omitted.
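The two ways of supplying the key can be combined in a script as shown below. This is a sketch, not lumei's own logic: resolve_api_key is a hypothetical helper, and giving the command-line value precedence over the environment variable is an assumption.

```python
import os


def resolve_api_key(cli_value=None, environ=None):
    """Pick the API key from the command-line value or OPENAI_API_KEY."""
    environ = os.environ if environ is None else environ
    # Assumption: an explicit command-line value wins over the environment.
    key = cli_value or environ.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError(
            "No API key found: pass --openai-api-key or set OPENAI_API_KEY."
        )
    return key


# Demonstration with explicit dictionaries instead of the real environment.
print(resolve_api_key("key-from-flag", {"OPENAI_API_KEY": "key-from-env"}))  # key-from-flag
print(resolve_api_key(None, {"OPENAI_API_KEY": "key-from-env"}))  # key-from-env
```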

--query

Names and descriptions of the data to search for. The input should be an array of JSON objects, each with two fields: name is the name of the data item and is used as the column name in the result dataset; search is a description of the data to search for.

Example:

[
    {
        'name': 'vendor', 
        'search': 'Name of the vendor who issued the invoice.'
    }, 
    {
        'name': 'price', 
        'search': 'Total bill from the invoice.'
    }
]
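A query can be checked before it is sent to the tool. The sketch below is not lumei's parser: parse_query and column_names are hypothetical helpers. Because the example above uses single quotes, which strict JSON does not allow, the sketch tries json.loads first and falls back to ast.literal_eval; how lumei itself parses the value is not documented here.

```python
import ast
import json


def parse_query(text):
    """Parse a --query value into a list of {'name': ..., 'search': ...} dicts."""
    try:
        items = json.loads(text)
    except json.JSONDecodeError:
        # Single-quoted input is a Python literal rather than strict JSON.
        items = ast.literal_eval(text)
    if not isinstance(items, list):
        raise ValueError("query must be an array of objects")
    for item in items:
        if not isinstance(item, dict) or not {"name", "search"} <= set(item):
            raise ValueError(f"each entry needs 'name' and 'search': {item!r}")
    return items


def column_names(items):
    """Return the result column names, one per query entry, in order."""
    return [item["name"] for item in items]


example = """[
    {'name': 'vendor', 'search': 'Name of the vendor who issued the invoice.'},
    {'name': 'price', 'search': 'Total bill from the invoice.'}
]"""
print(column_names(parse_query(example)))  # ['vendor', 'price']
```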

