Seamless umbrella distribution (all components by default)

Seamless

Seamless: define your computation once — cache it, scale it, share it.

Most computational pipelines are already reproducible — the same inputs produce the same outputs. Wrap your code as a step with declared inputs and outputs, and Seamless gives you caching (never recompute what you've already computed) and remote deployment (run on a cluster without changing your code). Remote execution also acts as a reproducibility test: if your wrapped code runs on a clean worker and produces the same result, it is reproducible. If not, Seamless has helped you find the problem — whether it's a missing input, an undeclared dependency, or a sensitivity to platform or library versions.
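
As a rough illustration of the reproducibility test described above (this is not Seamless's own mechanism), the sketch below runs a step twice in fresh, isolated Python interpreters and compares SHA-256 checksums of the output. The step code, the run_clean helper, and the JSON-over-stdin convention are all assumptions made for this example.

```python
import hashlib
import json
import subprocess
import sys

# Hypothetical step: reads JSON inputs from stdin, writes a JSON result to stdout.
STEP_CODE = """
import json, sys
inputs = json.load(sys.stdin)
print(json.dumps({"total": sum(inputs["values"])}))
"""


def run_clean(code: str, inputs: dict) -> str:
    """Run `code` in a fresh interpreter and return the checksum of its stdout.

    The -I flag starts Python in isolated mode (no user site-packages, no
    PYTHON* environment variables), a crude stand-in for a clean worker.
    """
    proc = subprocess.run(
        [sys.executable, "-I", "-c", code],
        input=json.dumps(inputs),
        capture_output=True,
        text=True,
        check=True,
    )
    return hashlib.sha256(proc.stdout.encode("utf-8")).hexdigest()


inputs = {"values": [1, 2, 3]}
first = run_clean(STEP_CODE, inputs)
second = run_clean(STEP_CODE, inputs)

# Matching checksums suggest the step depends only on its declared inputs.
reproducible = first == second
```

If the two checksums differ, the step is reading something it did not declare, such as the clock, a random seed, or a file outside its inputs.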

Seamless wraps both Python and command-line code. In Python, direct runs a function immediately; delayed records the function for deferred or remote execution. From the shell, seamless-run wraps any command as a Seamless transformation — no Python required. In both cases, the transformation is identified by the checksum of its code and inputs: identical work always produces the same identity.
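
To make the idea of a checksum-based identity concrete, here is a minimal sketch of content addressing using only the standard library. It assumes SHA-256 and a JSON serialization; the actual checksum algorithm and serialization used by Seamless are not specified here, and the names checksum and transformation_id are hypothetical.

```python
import hashlib
import json


def checksum(data: bytes) -> str:
    """Hex digest of a buffer (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(data).hexdigest()


def transformation_id(code: str, inputs: dict) -> str:
    """Derive an identity from the checksum of the code and of each input.

    Keys are sorted so that the order in which inputs are given does not matter.
    """
    payload = {
        "code": checksum(code.encode("utf-8")),
        "inputs": {
            name: checksum(json.dumps(value, sort_keys=True).encode("utf-8"))
            for name, value in inputs.items()
        },
    }
    return checksum(json.dumps(payload, sort_keys=True).encode("utf-8"))


CODE = "def add(a, b):\n    return a + b\n"

same_1 = transformation_id(CODE, {"a": 1, "b": 2})
same_2 = transformation_id(CODE, {"b": 2, "a": 1})  # same work, different order
other = transformation_id(CODE, {"a": 1, "b": 3})   # different input value
```

Here same_1 and same_2 are equal because they describe identical work, while other differs because one input changed. Changing a single character of CODE would likewise produce a new identity.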

Sharing works at two levels. The lightweight path is to exchange checksums: if two researchers have computed the same transformation, they already have the same result — no data transfer needed. The concrete path is to share the seamless.db file, a portable SQLite database that maps transformation checksums to result checksums. Copy it to a colleague, a cluster, or a publication archive, and every cached result travels with it. Combined, these two paths let a lab build up a shared computation cache that grows over time and never recomputes what anyone has already computed.
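
The mapping from transformation checksums to result checksums can be pictured as a single SQLite table. The sketch below uses Python's sqlite3 module with a hypothetical schema (a table named results with two text columns); the real layout of seamless.db may differ, and the helper names are invented for illustration.

```python
import sqlite3


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a cache database. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS results ("
        "transformation TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    return conn


def store(conn: sqlite3.Connection, transformation: str, result: str) -> None:
    """Record that a transformation checksum maps to a result checksum."""
    conn.execute(
        "INSERT OR REPLACE INTO results VALUES (?, ?)", (transformation, result)
    )
    conn.commit()


def lookup(conn: sqlite3.Connection, transformation: str):
    """Return the cached result checksum, or None on a cache miss."""
    row = conn.execute(
        "SELECT result FROM results WHERE transformation = ?", (transformation,)
    ).fetchone()
    return row[0] if row else None


def merge(dst: sqlite3.Connection, src: sqlite3.Connection) -> None:
    """Copy entries from a colleague's cache, keeping existing local entries."""
    rows = src.execute("SELECT transformation, result FROM results").fetchall()
    dst.executemany("INSERT OR IGNORE INTO results VALUES (?, ?)", rows)
    dst.commit()


mine = open_cache()
theirs = open_cache()
store(theirs, "tf-checksum-1", "result-checksum-1")  # placeholder checksums

miss = lookup(mine, "tf-checksum-1")  # not computed locally yet
merge(mine, theirs)
hit = lookup(mine, "tf-checksum-1")   # now served from the shared cache
```

Because the keys are content-derived, merging two caches never conflicts in principle: the same transformation checksum should always map to the same result checksum.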

What about interactivity?

This is Seamless 1.x, running on a new code architecture. Seamless 0.x offered an interactive, notebook-first workflow experience with reactive cells, Jupyter widget integration, filesystem mounting, and collaborative web interfaces. These features are being ported to the new architecture. If your work is primarily interactive/exploratory, you can use the legacy version today, or watch this space for updates.

Installation

pip install seamless-suite

This installs all standard Seamless components. For a minimal install, the core user-facing packages are:

| Package | Import | Provides |
| --- | --- | --- |
| seamless-core | import seamless | Checksum, Buffer, cell types, buffer cache |
| seamless-transformer | from seamless.transformer import direct, delayed | direct, delayed, seamless-run, seamless-upload, seamless-download |
| seamless-config | import seamless.config | seamless.config.init(), seamless-init |

Documentation

Full documentation — including getting-started guides, cluster setup, remote execution, and reference API — is at:

https://sjdv1982.github.io/seamless/
