Computation engine for Seamless: execute checksum-addressed transformations in Python or bash
seamless-transformer
seamless-transformer is the computation engine of the Seamless framework. It takes a transformation — a pure-functional computation defined as a checksum-addressed dict of inputs, code, and language — and executes it, returning a result checksum. It supports Python and bash transformations, multi-process worker pools with shared-memory IPC, and integration with the Seamless caching and remote infrastructure.
Core concepts
A transformation in Seamless is a deterministic computation: given the same inputs and code (identified by their checksums), it always produces the same output. seamless-transformer is responsible for:
- Building the transformation dict from the inputs and code, then computing its checksum (which serves as the transformation's identity for caching).
- Building the execution namespace: resolving input buffers, compiling modules, injecting dependencies.
- Executing the code: either Python (via exec) or bash (via subprocess with file-mapped pins).
- Returning the result as a checksum, which can be cached and reused.
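The four steps above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the seamless-transformer API: the function names (checksum, build_transformation, execute), the use of SHA-256 and JSON serialization, and the convention that the code assigns to a variable named result are all assumptions made for the example.

```python
import hashlib
import json


def checksum(data: bytes) -> str:
    """Hex SHA-256 digest, used here as a stand-in content address."""
    return hashlib.sha256(data).hexdigest()


def build_transformation(code: str, inputs: dict, language: str = "python"):
    """Return (transformation dict, its checksum, buffer store).

    The dict refers to code and inputs only by checksum; the buffers
    themselves live in a separate store keyed by checksum.
    """
    buffers = {}

    def store(value) -> str:
        buf = json.dumps(value, sort_keys=True).encode()
        cs = checksum(buf)
        buffers[cs] = buf
        return cs

    transformation = {
        "language": language,
        "code": store(code),
        "inputs": {name: store(value) for name, value in sorted(inputs.items())},
    }
    tf_buf = json.dumps(transformation, sort_keys=True).encode()
    return transformation, checksum(tf_buf), buffers


def execute(transformation: dict, buffers: dict) -> str:
    """Resolve inputs, run the code with exec, return the result checksum."""
    namespace = {
        name: json.loads(buffers[cs])
        for name, cs in transformation["inputs"].items()
    }
    code = json.loads(buffers[transformation["code"]])
    exec(code, namespace)  # hypothetical convention: code assigns to `result`
    result_buf = json.dumps(namespace["result"], sort_keys=True).encode()
    cs = checksum(result_buf)
    buffers[cs] = result_buf
    return cs


tf, tf_checksum, buffer_store = build_transformation(
    "result = a + b", {"a": 2, "b": 3}
)
result_checksum = execute(tf, buffer_store)
print(tf_checksum, result_checksum)
```

Because the transformation checksum depends only on the checksums of the code and inputs, building the same transformation twice yields the same identity, which is what makes the result cacheable.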
Worker pool
For production use, seamless-transformer can spawn a pool of worker processes (seamless_transformer.worker.spawn()). Workers run in separate processes using the spawn multiprocessing context, and communicate with the parent via a custom IPC channel built on multiprocessing.connection.Connection and shared memory.
- The parent distributes transformation requests to the least-loaded worker.
- Workers can delegate sub-transformations back to the parent (which redistributes them).
- Buffer data is exchanged through shared memory to avoid serialization overhead.
- Workers automatically restart on crash (segfault, etc.).
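Two of the points above, least-loaded dispatch and buffer exchange through shared memory, can be illustrated with the standard library alone. The sketch below is not the seamless-transformer implementation: LeastLoadedDispatcher, put_buffer, and get_buffer are hypothetical names, and for simplicity both ends of the shared-memory handoff run in one process, whereas a real pool would attach from a separate worker process.

```python
from multiprocessing import shared_memory


class LeastLoadedDispatcher:
    """Track in-flight requests per worker and pick the least-loaded one."""

    def __init__(self, worker_ids):
        self.load = {wid: 0 for wid in worker_ids}

    def acquire(self):
        # min() returns the first worker with the lowest in-flight count
        wid = min(self.load, key=self.load.get)
        self.load[wid] += 1
        return wid

    def release(self, wid):
        self.load[wid] -= 1


def put_buffer(data: bytes):
    """Copy data into a new shared-memory block; return (block, name, size)."""
    shm = shared_memory.SharedMemory(create=True, size=max(len(data), 1))
    shm.buf[: len(data)] = data
    return shm, shm.name, len(data)


def get_buffer(name: str, size: int) -> bytes:
    """Attach to an existing block by name and copy the payload out."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:size])
    finally:
        shm.close()


dispatcher = LeastLoadedDispatcher(["w0", "w1", "w2"])
assigned = [dispatcher.acquire() for _ in range(4)]
dispatcher.release("w1")  # w1 finishes a request
next_worker = dispatcher.acquire()

block, name, size = put_buffer(b"input buffer")
try:
    received = get_buffer(name, size)
finally:
    block.close()
    block.unlink()
print(assigned, next_worker, received)
```

Only the block name and payload size need to travel over the IPC channel; the payload itself is never pickled. The size is passed explicitly because the operating system may round the allocated block up to a page boundary.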
Integration with the Seamless ecosystem
- seamless-core: provides the Checksum, Buffer, and buffer-cache primitives that seamless-transformer builds on.
- seamless-dask: optionally offloads transformations to a Dask cluster (TransformationDaskMixin).
- seamless-remote: used by the transformation cache to (a) look up cached results in the database before running, (b) access the buffer server for buffer data, and (c) submit transformations to the jobserver for remote execution (an alternative to local execution, not a cache lookup).
- seamless-config: supplies project/stage selection for storage routing.
- seamless-jobserver: depends on seamless-transformer to execute transformations received from the job queue.
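The order of operations described for seamless-remote (consult the database first, then execute either remotely or locally) can be sketched as follows. This is a simplified model rather than the actual seamless-remote interface: run_transformation, the dict standing in for the database, and the execute and remote_submit callables are all hypothetical.

```python
def run_transformation(tf_checksum, execute, database, remote_submit=None):
    """Return (result checksum, origin), consulting the cache before running.

    database:      dict-like mapping of transformation checksum -> result checksum
    execute:       callable that runs the transformation locally
    remote_submit: optional callable that runs it on a jobserver instead
    """
    cached = database.get(tf_checksum)
    if cached is not None:
        return cached, "cache"
    if remote_submit is not None:
        result, origin = remote_submit(tf_checksum), "remote"
    else:
        result, origin = execute(tf_checksum), "local"
    database[tf_checksum] = result  # later requests become cache hits
    return result, origin


calls = []


def fake_execute(tf_checksum):
    """Stand-in for local execution; records each call."""
    calls.append(tf_checksum)
    return "result-of-" + tf_checksum


database = {}
first = run_transformation("abc123", fake_execute, database)
second = run_transformation("abc123", fake_execute, database)
print(first, second, len(calls))
```

The second request is answered from the database without calling the executor again. Remote submission sits on the execution branch, not the lookup branch, matching the distinction drawn in the list above.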
CLI scripts
Installing seamless-transformer provides:
| Command | Description |
|---|---|
| seamless-run | The CLI face of Seamless: wrap a bash command or pipeline as a transformation, using file/directory argument names as pin names |
| seamless-upload | Upload input files/directories to the buffer server and write .CHECKSUM sidecar files, staging inputs for seamless-run |
| seamless-download | Fetch result files/directories from the buffer server using .CHECKSUM sidecar files produced by seamless-run |
| seamless-run-transformation | Universal transformation executor: run any Seamless transformation (Python, bash, or other) by checksum and print the result checksum |
| seamless-queue | Run a queue server that executes seamless-run --qsubmit jobs concurrently; this is the CLI face's parallelization mechanism beyond & |
| seamless-queue-finish | Signal the queue server to drain remaining jobs and shut down |
| seamless-mode-bind.sh | Shell script: source it to bind seamless-mode commands and hotkeys into the current shell session |
Installation
pip install seamless-transformer
Setting up seamless-mode
After installation, seamless-mode-bind.sh is available on your PATH. Source it in your shell session to activate the seamless-mode-on, seamless-mode-off, and seamless-mode-toggle commands and the Ctrl-U U hotkey.
Manual (any environment) — add to ~/.bashrc or ~/.zshrc:
source $(which seamless-mode-bind.sh)
Conda — auto-activate with the environment:
cp $(which seamless-mode-bind.sh) $CONDA_PREFIX/etc/conda/activate.d/
venv / virtualenv — append to the environment's activate script:
echo "source $(which seamless-mode-bind.sh)" >> $VIRTUAL_ENV/bin/activate
virtualenvwrapper — add to the environment's postactivate hook:
echo "source $(which seamless-mode-bind.sh)" >> $VIRTUAL_ENV/bin/postactivate