
Upload, download, and delete files easily between Local/SFTP/Cloud storage using rclone


Prefect Managed File Transfer


Turn a prefect.io server into a managed file transfer solution. Create cron-style jobs (aka Flows!) through the UI or programmatically to upload and download files easily between servers. Supports local and SFTP remotes plus any cloud storage supported by rclone, so that's AWS, Azure, Google, SharePoint, and many more out of the box.

Using prefect for managed file transfer means retries, logging, multi-node operation and high availability come as standard, turning prefect into a reliable, enterprise-ready file transfer solution.

This package is not the fastest solution to move files around, but it prioritises reliability and ease of use, making it an excellent choice for replacing both quick cron job copy scripts and enterprise managed file transfer appliances.

Key features

  • Copy, move, and delete files between almost any storage system easily.
  • Reliable file operations with checksumming, file size checking etc.
  • Smart and safe moving - settings to allow/block overwriting and to only copy files if they are new or changed.
  • Unzip/Untar compressed folders after downloading them.
  • Repath files as you move them.
  • Complex filtering and ordering of files - by path, age, size etc. Pattern matching with regular expressions.
  • Leverage Prefect.IO's built-in scheduling and orchestration capabilities:
    • Transfer files on complex cron schedules
    • Notifications on success/failure - Slack, email, etc.
    • Highly available server architecture - database server + multi-node workers and front ends.
  • Available as a PyPI package for integration into existing self-hosted and cloud prefect deployments, and as a docker image/appliance
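
The "checksumming, file size checking" feature above can be illustrated with a minimal standard-library sketch. This is not the package's implementation: `sha256_of` and `copy_verified` are hypothetical helper names, and the choice of SHA-256 is an assumption made for the example.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large files are never loaded fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_verified(source: Path, destination: Path) -> None:
    """Copy a file, then confirm the size and checksum match the source."""
    shutil.copy2(source, destination)
    # Cheap check first: a size mismatch means the copy is certainly bad.
    if source.stat().st_size != destination.stat().st_size:
        raise IOError(f"size mismatch after copying {source}")
    # Slower check: compare content hashes of both files.
    if sha256_of(source) != sha256_of(destination):
        raise IOError(f"checksum mismatch after copying {source}")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "example.bkup"
        dst = Path(tmp) / "example-copy.bkup"
        src.write_bytes(b"backup contents")
        copy_verified(src, dst)
        print("verified", dst.name)
```

Checking the size before hashing keeps the common failure case (a truncated copy) cheap to detect.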

Example use cases:

  • Once per day SSH into my database server and copy the latest *.bkup file to a central storage location.
  • Monitor a local network share directory for new files and automatically upload them to a cloud storage bucket.
  • Schedule a weekly job to synchronize files between two remote servers.
  • Move log files older than 30 days from an SSH-accessible web server to a cold storage location, then delete the originals.
  • Copy file yyyy-MM-dd.zip from a remote server, where yyyy-MM-dd matches today's date, to a local directory and then unzip it.
  • Download any file in an S3 bucket larger than 1GB and store it in a local directory.
  • Delete temporary files older than 7 days from a remote server to free up disk space.
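
Several of the use cases above come down to selecting files by name, age, or size. The following is a rough sketch of that selection logic in plain Python. `FileInfo`, `dated_zip_name`, `older_than`, and `larger_than` are hypothetical names for illustration and are not part of prefect-managedfiletransfer; treating 1GB as 1024³ bytes is also an assumption.

```python
from dataclasses import dataclass
from datetime import date, datetime, timedelta, timezone

ONE_GB = 1024 ** 3  # assumption: binary gigabytes


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical listing entry: path, size, and last-modified time."""
    path: str
    size_bytes: int
    modified: datetime


def dated_zip_name(day: date) -> str:
    """Render the yyyy-MM-dd.zip file name for the given day."""
    return f"{day:%Y-%m-%d}.zip"


def older_than(files: list[FileInfo], days: int, now: datetime) -> list[FileInfo]:
    """Keep files last modified more than `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [f for f in files if f.modified < cutoff]


def larger_than(files: list[FileInfo], min_bytes: int) -> list[FileInfo]:
    """Keep files strictly larger than `min_bytes`."""
    return [f for f in files if f.size_bytes > min_bytes]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    listing = [
        FileInfo("logs/old.log", 2_048, now - timedelta(days=45)),
        FileInfo("logs/recent.log", 4_096, now - timedelta(days=2)),
        FileInfo("dumps/big.bin", 3 * ONE_GB, now - timedelta(days=1)),
    ]
    print("today's archive:", dated_zip_name(now.date()))
    print("older than 30 days:", [f.path for f in older_than(listing, 30, now)])
    print("larger than 1GB:", [f.path for f in larger_than(listing, ONE_GB)])
```

Passing `now` explicitly, rather than reading the clock inside each filter, keeps the filters deterministic and easy to test.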

Visit the full docs here.

Installation - Local

Install prefect-managedfiletransfer with pip. (Requires an installation of Python 3.10+.)

pip install prefect-managedfiletransfer
# or 
uv add prefect-managedfiletransfer

We recommend using a Python virtual environment manager such as uv, pipenv, conda or virtualenv.

In one (venv) terminal, start a prefect server with logging enabled:

export PREFECT_LOGGING_LEVEL="INFO"
export PREFECT_LOGGING_EXTRA_LOGGERS="prefect_managedfiletransfer"
prefect server start
# OR uv run prefect server start

There are many ways to manage infrastructure and code with prefect. Here we demonstrate starting a local worker in a second terminal:

export PREFECT_API_URL=http://127.0.0.1:4200/api
# or perhaps export PREFECT_API_URL=http://host.docker.internal:4200/api
export PREFECT_LOGGING_EXTRA_LOGGERS="prefect_managedfiletransfer"
export PREFECT_LOGGING_LEVEL="INFO"
# [Optional] add all logs: export PREFECT_LOGGING_ROOT_LEVEL="INFO"


prefect worker start --pool 'default-pool' --type process

# OR add a worker with config to spawn containers that can talk to the server API:
PREFECT_API_URL=http://host.docker.internal:4200/api uv run prefect worker start --pool 'default-pool' --type=docker  
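
If a worker cannot connect, it can help to confirm that the URL in PREFECT_API_URL is reachable from where the worker runs. The sketch below assumes the server exposes a health endpoint at `<PREFECT_API_URL>/health`. Verify that against your prefect version. The function names are hypothetical and only the standard library is used.

```python
import os
import urllib.request

DEFAULT_API_URL = "http://127.0.0.1:4200/api"  # same default as the export above


def configured_api_url() -> str:
    """Read PREFECT_API_URL from the environment, falling back to the local default."""
    return os.environ.get("PREFECT_API_URL", DEFAULT_API_URL)


def health_url(api_url: str) -> str:
    """Build the assumed health endpoint URL from the API base URL."""
    return api_url.rstrip("/") + "/health"


def api_is_healthy(api_url: str, timeout: float = 5.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(health_url(api_url), timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers connection refused, DNS failures, timeouts, and HTTP errors.
        return False
```

Calling `api_is_healthy(configured_api_url())` from the worker's environment (for example inside the container when using host.docker.internal) shows whether the API is reachable from there.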

Register the blocks using the prefect CLI:

prefect block register -m prefect_managedfiletransfer

And then deploy the flows.

# deploy the flows to run locally
python -m prefect_managedfiletransfer.deploy --local

# OR deploy to run with a docker image - see deploy.py
python -m prefect_managedfiletransfer.deploy --docker

# or a version of the above using uv run:
uv run python -m prefect_managedfiletransfer.deploy --local
uv run python -m prefect_managedfiletransfer.deploy --docker

Visit the server UI at http://localhost:4200.

  1. Create 2 blocks, one source and one destination
  2. On the deployments page start a transfer_files_flow. Configure your flow run to copy/move files between the 2 blocks.

Visit the full docs here. Note that this is a work-in-progress, auto-generated documentation site, so it is not perfect.

Installation - docker

Run prefect managed file transfer in a docker container, like an appliance. See Docker Hub for a list of images.

Note that this setup is ephemeral: data is lost when the container is removed. Prefect has extensive docs on how to set up a database server for persistent use.

# run prefect server in a self-removing container port-forwarded to your local machine’s 4200 port:
docker run --rm -it -p 4200:4200 managedfiletransfer/prefect-managedfiletransfer:latest

Components

Flows

  • transfer_files_flow - a fully featured flow for transferring files between different storage locations. Supports copy and move modes.
  • upload_file_flow - a flow for uploading a file to a remote server. Supports pattern matching by date.
  • delete_files_flow - a flow for deleting files from a remote server based on pattern matching and filtering.
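
To make the copy and move modes concrete, here is a minimal local-filesystem sketch of the behaviour described in this README: overwriting can be allowed or blocked, and existing files are only replaced when the source is new or changed. It is not the code behind transfer_files_flow. `Mode`, `is_new_or_changed`, and `transfer` are hypothetical names, and comparing size and modification time is an assumed change-detection rule.

```python
import shutil
import tempfile
from enum import Enum
from pathlib import Path


class Mode(Enum):
    COPY = "copy"
    MOVE = "move"


def is_new_or_changed(source: Path, destination: Path) -> bool:
    """Treat a file as changed if it is missing at the destination, differs in size, or is newer."""
    if not destination.exists():
        return True
    src, dst = source.stat(), destination.stat()
    return src.st_size != dst.st_size or src.st_mtime > dst.st_mtime


def transfer(source: Path, destination: Path, mode: Mode = Mode.COPY,
             overwrite: bool = False) -> bool:
    """Transfer one file. Returns True if it was transferred, False if it was skipped."""
    if destination.exists():
        if not overwrite:
            return False  # overwriting is blocked
        if not is_new_or_changed(source, destination):
            return False  # nothing to do: destination is already up to date
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also preserves the modification time
    if mode is Mode.MOVE:
        source.unlink()  # only remove the original after a successful copy
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in" / "report.csv"
        dst = Path(tmp) / "out" / "report.csv"
        src.parent.mkdir()
        src.write_text("a,b\n1,2\n")
        print("first copy transferred:", transfer(src, dst))
        print("second copy transferred:", transfer(src, dst))
```

In this sketch a skipped move leaves the source in place, so a file is never deleted unless it was copied in the same call.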

Blocks

  • ServerWithBasicAuthBlock - A block for connecting to a server using basic authentication.
  • ServerWithPublicKeyAuthBlock - A block for connecting to a server using public key authentication.
  • RCloneConfigFileBlock - A block for managing RClone configuration files.

Tasks

  • list_remote_files_task - A task for listing files in a remote directory.
  • download_file_task - A task for downloading a single file from a remote server.
  • upload_file_task - A task for uploading a single file to a remote server.
  • delete_file_task - A task for deleting a single file from a remote server.

Screenshot of transfer files flow

Feedback

If you encounter any bugs while using prefect-managedfiletransfer, feel free to open an issue in the prefect-managedfiletransfer repository.

Feel free to star or watch prefect-managedfiletransfer for updates too!

Contributing

If you'd like to help fix an issue or add a feature to prefect-managedfiletransfer, please propose changes through a pull request from a fork of the repository.

Here are the steps:

  1. Fork the repository
  2. Clone the forked repository
  3. Install the repository and its dependencies:

# install uv first, then
uv sync

You can also access all the prefect CLI tooling inside a uv managed venv:

uv venv
source .venv/bin/activate
prefect server start

  4. Make desired changes
  5. Add tests
  6. Insert an entry to CHANGELOG.md
  7. Install pre-commit to perform quality checks prior to commit:

pre-commit install

  8. Use the build script to run all the checks and tests:

./build.sh

  9. Use ./run_local.sh to deploy a local prefect server, worker, and UI to test your changes
  10. git commit, git push, and create a pull request

