Document summarization API service
Project description
wordmill
A work-in-progress.
wordmill is a document summary API service. It provides a simple REST API which accepts requests to summarize a document. Under the hood, it reaches out to a LLM hosted on any OpenAI-compatible API service.
It is designed to abstract away the "AI" details from your user base. Many people in your organization want to summarize documents. Not as many of them care to know all the details related to LLMs, prompts, document prep, model selection, etc. The service will allow administrators to configure and customize these aspects based on the incoming document type. All the end users need to do is request a summary.
Setup
-
It is recommended to install pyenv.
NOTE: Make sure to follow all steps! (A, B, C, D, and so on)
-
Set up the virtual environment:
pipenv install --dev
-
Add your LLM access info into
.env, example:LLM_API_KEY=<your key> LLM_BASE_URL="https://my-llm-server:443/v1" LLM_MODEL_NAME="mistral-7b-instruct"
Running API server
To run the server:
pipenv shell
flask run
Example Usage of the API server
The service accepts a request to summarize the document and returns a URL that you should visit to check the status of your summary.
A background task reaches out to the LLM and awaits the response. Eventually, the status of your summarize task will shift to 'done' and you can view the LLM-generated content. Your task may also shift to 'error' if something went wrong.
Since these summaries do not need to be long-lived, currently we are using flask-caching's "SimpleCache" to store the data. For production purposes, the cache service used by flask-caching will need to be changed to redis or memcached
import json
import requests
import time
# load the document you wish to summarize
with open("incident.json") as fp:
data = json.load(fp)
# customize the prompt passed to the LLM (optional)
requests.post(
"http://127.0.0.1:8000/prompt",
json={"prompt": "Please summarize this document:\n\n{document}"}
)
# submit request to summarize and get background task id
id = requests.post("http://127.0.0.1:8000/summarize", json={"document": data}).json()["id"]
# repeatedly check on the 'summarize' task and wait for summary to be generated...
while True:
time.sleep(5)
summary = requests.get(f"http://127.0.0.1:8000/summary/{id}").json()
if summary["status"] == "done":
print(summary["content"])
break
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file wordmill-0.0.2.tar.gz.
File metadata
- Download URL: wordmill-0.0.2.tar.gz
- Upload date:
- Size: 41.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
d1c5c7b846da5ab908f456216f88a6a7e3ce67ed2d7f7b91d5ad64ff7c09c416
|
|
| MD5 |
642492d615911b25b332aaedc1854d0b
|
|
| BLAKE2b-256 |
9e635ffa10266a87622205bf40ad06068138e6d52cc740c46c09e14fce328522
|
Provenance
The following attestation bundles were made for wordmill-0.0.2.tar.gz:
Publisher:
release.yml on RedHatInsights/wordmill
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
wordmill-0.0.2.tar.gz -
Subject digest:
d1c5c7b846da5ab908f456216f88a6a7e3ce67ed2d7f7b91d5ad64ff7c09c416 - Sigstore transparency entry: 207147037
- Sigstore integration time:
-
Permalink:
RedHatInsights/wordmill@bfc74171c2a1f66680ac238dc909a6bb144c5b36 -
Branch / Tag:
refs/tags/v0.0.2 - Owner: https://github.com/RedHatInsights
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@bfc74171c2a1f66680ac238dc909a6bb144c5b36 -
Trigger Event:
push
-
Statement type:
File details
Details for the file wordmill-0.0.2-py3-none-any.whl.
File metadata
- Download URL: wordmill-0.0.2-py3-none-any.whl
- Upload date:
- Size: 9.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.12.9
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
fa20cbb5bc1417dcd4d02bfc203fe2b02702db3d424224ce3adc362c13d6a579
|
|
| MD5 |
ebd2f5f02da002f47f3de74961260a74
|
|
| BLAKE2b-256 |
b508acf700b42b72cf3f4e397a72a56a5770e1226cd3877bca3a91cf7c58250e
|
Provenance
The following attestation bundles were made for wordmill-0.0.2-py3-none-any.whl:
Publisher:
release.yml on RedHatInsights/wordmill
-
Statement:
-
Statement type:
https://in-toto.io/Statement/v1 -
Predicate type:
https://docs.pypi.org/attestations/publish/v1 -
Subject name:
wordmill-0.0.2-py3-none-any.whl -
Subject digest:
fa20cbb5bc1417dcd4d02bfc203fe2b02702db3d424224ce3adc362c13d6a579 - Sigstore transparency entry: 207147039
- Sigstore integration time:
-
Permalink:
RedHatInsights/wordmill@bfc74171c2a1f66680ac238dc909a6bb144c5b36 -
Branch / Tag:
refs/tags/v0.0.2 - Owner: https://github.com/RedHatInsights
-
Access:
public
-
Token Issuer:
https://token.actions.githubusercontent.com -
Runner Environment:
github-hosted -
Publication workflow:
release.yml@bfc74171c2a1f66680ac238dc909a6bb144c5b36 -
Trigger Event:
push
-
Statement type: