Intextum processing worker: HTTP-polling Docling/FFmpeg document, image and audio pipeline.
intextum-worker
The Intextum processing worker: an HTTP-polling worker that pulls tasks from an Intextum API instance and runs the Docling / FFmpeg document, image and audio pipeline (OCR, ASR, chunking, classification, content enrichment, embeddings).
The worker is always remote: it downloads source files from the API and uploads results back to it over HTTP, so it needs no shared volume and can run anywhere, including on a host with a GPU while the rest of the stack runs in Docker.
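The poll, download, process, upload cycle described above can be sketched as follows. This is a minimal illustration, not the worker's actual code: the function names (`poll_once`, `run_worker`), the task shape, and the injected callables are all hypothetical, and the real HTTP endpoints are not documented here, so the transport is left to the caller.

```python
import time
from typing import Any, Callable, Optional

# Hypothetical task shape; the real API payload is not documented here.
Task = dict


def poll_once(
    fetch_task: Callable[[], Optional[Task]],
    download: Callable[[Task], bytes],
    process: Callable[[Task, bytes], Any],
    upload: Callable[[Task, Any], None],
) -> bool:
    """Handle at most one task. Returns True if a task was processed."""
    task = fetch_task()              # ask the API for work
    if task is None:
        return False                 # nothing queued right now
    source = download(task)          # pull the source file over HTTP
    result = process(task, source)   # run the local pipeline
    upload(task, result)             # push the result back over HTTP
    return True


def run_worker(
    fetch_task: Callable[[], Optional[Task]],
    download: Callable[[Task], bytes],
    process: Callable[[Task, bytes], Any],
    upload: Callable[[Task, Any], None],
    poll_interval: float = 5.0,            # assumed default, for illustration
    max_iterations: Optional[int] = None,  # None means loop forever
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll repeatedly, sleeping only when the queue is empty."""
    handled = 0
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        if poll_once(fetch_task, download, process, upload):
            handled += 1
        else:
            sleep(poll_interval)
    return handled
```

Because every step goes through a callable rather than a shared filesystem, the same loop runs unchanged whether the worker sits next to the API or on a separate GPU host.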
Install
Pick the bundle that matches your accelerator. The macOS (Apple MPS) wheels are on PyPI, so this bundle installs with no extra flags:
pip install 'intextum-worker[mps]'
The Linux CPU and NVIDIA CUDA bundles pull their Torch build from the PyTorch index, so add the matching --extra-index-url:
# Linux, CPU only
pip install 'intextum-worker[cpu]' --extra-index-url https://download.pytorch.org/whl/cpu
# Linux, NVIDIA CUDA 12.6
pip install 'intextum-worker[cuda]' --extra-index-url https://download.pytorch.org/whl/cu126
Available extras: mps, cpu, cuda, cpu-document (document/image only), and
the granular document, asr, and enrichment stacks.
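For scripted installs, the mapping between bundle and index URL can be captured in a small helper. The sketch below only encodes the three combinations shown above; the helper name `pip_install_args` is hypothetical, and the other extras are rejected because this page does not say which index they need.

```python
from typing import Dict, List, Optional

# Extra -> PyTorch index URL, as listed in the install commands above.
# None means the Torch wheels come from PyPI itself.
PYTORCH_INDEX: Dict[str, Optional[str]] = {
    "mps": None,
    "cpu": "https://download.pytorch.org/whl/cpu",
    "cuda": "https://download.pytorch.org/whl/cu126",
}


def pip_install_args(extra: str) -> List[str]:
    """Build the pip argument list for one accelerator bundle."""
    if extra not in PYTORCH_INDEX:
        raise ValueError(f"no known index mapping for extra {extra!r}")
    args = ["pip", "install", f"intextum-worker[{extra}]"]
    index_url = PYTORCH_INDEX[extra]
    if index_url is not None:
        args += ["--extra-index-url", index_url]
    return args
```

The returned list can be passed directly to `subprocess.run`, which avoids shell quoting issues around the square brackets.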
Run
export API_URL="https://your-intextum-host" # the API to poll
export WORKER_TOKEN="<token from the Add Worker dialog>"
intextum-worker --capabilities document,video,image
intextum-worker --help lists all flags. Every flag also has an environment
variable (API_URL, WORKER_TOKEN, WORK_DIR, CAPABILITIES, POLL_INTERVAL,
CLASSIFICATION_DEVICE, DOCLING_OCR_ENGINE, …); CLI flags take precedence.
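One common way to get "environment variable as default, CLI flag wins" is to feed the environment value into each argparse default. The sketch below illustrates that pattern only; it is not the worker's actual parser. The flag names --api-url and --poll-interval and the fallback values are assumptions (only --capabilities and --help are confirmed above).

```python
import argparse
import os
from typing import Mapping, Optional, Sequence


def build_parser(env: Mapping[str, str]) -> argparse.ArgumentParser:
    """Each flag defaults to its environment variable, if set."""
    parser = argparse.ArgumentParser(prog="worker-sketch")
    # --api-url is a hypothetical flag name for the API_URL variable.
    parser.add_argument("--api-url", default=env.get("API_URL"))
    # The "document" fallback is an assumption for illustration.
    parser.add_argument(
        "--capabilities", default=env.get("CAPABILITIES", "document")
    )
    # --poll-interval and its 5 second fallback are also assumptions.
    parser.add_argument(
        "--poll-interval",
        type=float,
        default=float(env.get("POLL_INTERVAL", "5")),
    )
    return parser


def resolve_settings(
    argv: Sequence[str], env: Optional[Mapping[str, str]] = None
) -> argparse.Namespace:
    """An explicit flag in argv overrides the environment-derived default."""
    env = os.environ if env is None else env
    return build_parser(env).parse_args(argv)
```

With CAPABILITIES=document,video in the environment, passing --capabilities image on the command line yields image, while omitting the flag yields document,video.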
Development
This package uses a src/ layout. The repo-root VERSION file is the single
source of truth for the version; it is staged into worker/VERSION at build time
(worker/VERSION is gitignored).
cp ../VERSION VERSION # stage the version for an editable install
pip install -e '.[mps,test]' # or [cpu,test] / [cuda,test]
pytest
export API_URL="http://localhost:8000"
export WORKER_TOKEN="<token from the Add Worker dialog>"
export CLASSIFICATION_DEVICE="mps" # macOS Apple Silicon; use cpu/cuda elsewhere
intextum-worker --capabilities document,video,image,training
Download files
File details
Details for the file intextum_worker-0.1.8.tar.gz.
File metadata
- Download URL: intextum_worker-0.1.8.tar.gz
- Upload date:
- Size: 112.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | f27e61adc391c59c68f3df9ac2e27336e9689b166aaa66f3827f338dfc04f94f |
| MD5 | 57e44656a3ff65f890b39bb1d8c8ece2 |
| BLAKE2b-256 | e436dcc673abaf1190f1dd236012328032a3cdae00ecbdb697e4fe21c60e2793 |
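A downloaded sdist can be checked against the published SHA256 digest with the standard library. This is a minimal sketch; the helper names `sha256_of_file` and `matches_published` are hypothetical, and the file path is whatever location the archive was saved to.

```python
import hashlib

# SHA256 digest published above for intextum_worker-0.1.8.tar.gz.
PUBLISHED_SHA256 = (
    "f27e61adc391c59c68f3df9ac2e27336e9689b166aaa66f3827f338dfc04f94f"
)


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large archives are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path: str, expected: str = PUBLISHED_SHA256) -> bool:
    """Compare the file's digest with the expected hex string."""
    return sha256_of_file(path) == expected.strip().lower()
```

pip can perform the same check at install time when a requirements file pins the digest and is installed with --require-hashes.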
Provenance
The following attestation bundles were made for intextum_worker-0.1.8.tar.gz:
Publisher: release-worker.yml on intextum/intextum
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: intextum_worker-0.1.8.tar.gz
- Subject digest: f27e61adc391c59c68f3df9ac2e27336e9689b166aaa66f3827f338dfc04f94f
- Sigstore transparency entry: 1935358010
- Sigstore integration time:
- Permalink: intextum/intextum@a035838b959824dcde6696b3f0399f2fdb2dabd5
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/intextum
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-worker.yml@a035838b959824dcde6696b3f0399f2fdb2dabd5
- Trigger Event: push
File details
Details for the file intextum_worker-0.1.8-py3-none-any.whl.
File metadata
- Download URL: intextum_worker-0.1.8-py3-none-any.whl
- Upload date:
- Size: 93.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a51f92bbff16d1db48c98030b11d8f36130193e7de684425241732c018a31207 |
| MD5 | 485a5470f3cdb543083ead575e81a720 |
| BLAKE2b-256 | 13ba619504413e3d2defad7f229d3d6c3a3863452e368a30901045de642114b5 |
Provenance
The following attestation bundles were made for intextum_worker-0.1.8-py3-none-any.whl:
Publisher: release-worker.yml on intextum/intextum
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: intextum_worker-0.1.8-py3-none-any.whl
- Subject digest: a51f92bbff16d1db48c98030b11d8f36130193e7de684425241732c018a31207
- Sigstore transparency entry: 1935358019
- Sigstore integration time:
- Permalink: intextum/intextum@a035838b959824dcde6696b3f0399f2fdb2dabd5
- Branch / Tag: refs/tags/v0.1.8
- Owner: https://github.com/intextum
- Access: public
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: release-worker.yml@a035838b959824dcde6696b3f0399f2fdb2dabd5
- Trigger Event: push