Remote Archiver: safely collect output files into archives on network filesystem
ReAr
A replacement for open() for scenarios where multiple processes generate lots of (log) files on a network filesystem. ReAr redirects the writes to Zip files to reduce the stress on the filesystem and to keep things organized. Writes to the archive are chunked and staged to avoid a single point of failure.
# On each worker:
async with rear_fs("/path/to/archive_base"):
    with rear_open("ar.zip/relpath/to/file", 'w+b') as f:  # open a read-write buffer ...
    #with rear_pickup("/path/to/temp-file", "ar.zip/relpath/to/file"):  # ... or pick up a file created by others
        f.write(b"...")
    # The file is written to a tmp archive on closing.
    # It will then be moved and eventually stored as `relpath/to/file` in zip file `/path/to/archive_base/ar.zip`.
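The "written to a tmp archive on closing" behaviour can be pictured with a minimal standard-library sketch. This is not ReAr's implementation: `buffered_member` and its parameters are hypothetical names chosen for illustration, and the sketch assumes the whole file fits in memory.

```python
import io
import zipfile
from contextlib import contextmanager


@contextmanager
def buffered_member(tmp_archive_path, dest):
    """Yield an in-memory read-write buffer; store it as member `dest` on exit.

    Hypothetical helper for illustration only, not part of ReAr.
    """
    buf = io.BytesIO()
    yield buf
    # Reached only if the body finished without raising, so a failed
    # write never leaves a partial member in the archive.
    with zipfile.ZipFile(tmp_archive_path, "a") as zf:
        zf.writestr(dest, buf.getvalue())
```

Because the network filesystem only sees one append per file, many small writes by the caller cost a single archive update.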
To avoid concurrent writes, each worker writes to its own temporary Zip file and starts a new one every 5 minutes. Run a scavenger to collect the files from the temporary archives into the final archives:
# On your main process:
async with scavengerd("/path/to/archive_base"):
    ...

# ... or to do it manually
while :; do
    rear-scavenger -d /path/to/archive_base
    sleep 5m
done
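The five-minute rotation of per-worker temporary archives could be sketched as follows. The class name, the file-naming scheme, and the injectable clock are all assumptions made for this example; ReAr's actual naming and timing logic may differ.

```python
import time
from pathlib import Path


class TempArchiveNamer:
    """Hand out a per-worker temporary archive path, rotating periodically.

    Hypothetical helper for illustration only, not part of ReAr.
    """

    def __init__(self, base, worker_id, period=300.0, clock=time.monotonic):
        self._base = Path(base)
        self._worker_id = worker_id
        self._period = period      # seconds between rotations (5 minutes)
        self._clock = clock        # injectable so the logic can be tested
        self._seq = 0
        self._started = None
        self._path = None

    def current(self):
        """Return the archive path to write to right now."""
        now = self._clock()
        if self._path is None or now - self._started >= self._period:
            # Start a fresh archive; the previous one is now closed
            # and safe for a scavenger to process.
            self._seq += 1
            self._started = now
            name = f"{self._worker_id}-{self._seq:06d}.tmp.zip"
            self._path = self._base / name
        return self._path
```

Since every worker only ever appends to a path it generated itself, no two processes write to the same Zip file.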
FAQ
What happens if a worker instance crashes?
Its current temporary archive will be missing the central directory, because it was not closed properly. The scavenger will try to recover as many files as possible (with zip -FF).
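To illustrate why recovery is possible at all, here is a simplified sketch of the idea: every Zip member is preceded by a local file header, so members can still be found by walking the file from the start even when the central directory is gone. This is only an approximation of what a tool like `zip -FF` does, not ReAr's code. The function names are hypothetical, and the sketch ignores ZIP64, encryption, and members whose sizes are stored in a trailing data descriptor.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte local file header: signature, version, flags, method,
# mod time, mod date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def recover_members(data):
    """Walk local file headers from offset 0 and return {name: content}."""
    recovered = {}
    pos = 0
    while data[pos:pos + 4] == b"PK\x03\x04":
        if pos + LOCAL_HEADER.size > len(data):
            break  # header itself is truncated
        (_sig, _ver, flags, method, _time, _date, crc, csize, _usize,
         name_len, extra_len) = LOCAL_HEADER.unpack_from(data, pos)
        if flags & 0x08:
            break  # sizes are in a data descriptor; not handled here
        name_start = pos + LOCAL_HEADER.size
        body_start = name_start + name_len + extra_len
        body = data[body_start:body_start + csize]
        if len(body) < csize:
            break  # member data is truncated
        name = data[name_start:name_start + name_len].decode("utf-8", "replace")
        if method == zipfile.ZIP_STORED:
            content = body
        elif method == zipfile.ZIP_DEFLATED:
            content = zlib.decompress(body, -15)  # raw deflate stream
        else:
            break  # unsupported compression method
        if zlib.crc32(content) == crc:
            recovered[name] = content  # keep only members that verify
        pos = body_start + csize
    return recovered


def make_truncated_archive(files):
    """Build a Zip in memory, then cut it off before the central directory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:  # default method: ZIP_STORED
        for name, content in files.items():
            zf.writestr(name, content)
    data = buf.getvalue()
    return data[:data.index(b"PK\x01\x02")]
```

A truncated archive built this way is rejected by `zipfile.is_zipfile`, yet `recover_members` still returns every member whose header and data were fully written. In a real crash the last member may be incomplete, in which case it is simply skipped.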
How does the scavenger work?
Multiple processes cannot write to one Zip file at the same time, so each process first deposits its files into its own temporary Zip file and records where those files should eventually be saved. Once a temporary Zip file is closed (when the process exits or after 5 minutes), the scavenger copies its files to their destination Zip files. The scavenger does not need to actively watch for incoming files, since it can organize them at any time after they are saved to the temporary Zip files. It is also safe to run multiple scavenger instances at any time: each checks whether an action is necessary before performing it.
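The collection step described above might look roughly like this. The sketch rests on assumptions that are not taken from ReAr: temporary archives are assumed to match `*.tmp.zip`, and the destination is assumed to be encoded in the member name as `<archive>/<path inside archive>`. It also omits the locking a real multi-instance scavenger would need.

```python
import zipfile
from pathlib import Path


def scavenge(base, tmp_glob="*.tmp.zip"):
    """Copy members of closed temporary archives into their destinations.

    Hypothetical sketch for illustration only; returns the number of
    members copied.
    """
    base = Path(base)
    moved = 0
    for tmp in sorted(base.glob(tmp_glob)):
        with zipfile.ZipFile(tmp) as src:
            for info in src.infolist():
                # Assumed convention: "ar.zip/relpath/to/file"
                archive, _, relpath = info.filename.partition("/")
                if not relpath or info.is_dir():
                    continue
                with zipfile.ZipFile(base / archive, "a") as dst:
                    # Skip members that are already present, so a
                    # repeated run does not duplicate entries.
                    if relpath not in dst.namelist():
                        dst.writestr(relpath, src.read(info))
                        moved += 1
        tmp.unlink()  # the temporary archive is fully processed
    return moved
```

Running `scavenge` a second time finds no temporary archives left and does nothing, which mirrors the "check before acting" property described in the answer above.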
Download files
Source Distribution
Built Distribution
File details
Details for the file rear-0.1.2.tar.gz.
File metadata
- Download URL: rear-0.1.2.tar.gz
- Upload date:
- Size: 6.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/0.0.0 CPython/3.10.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | d350e4685ea9e2bed68766b8126db1cfb1071340511c411b4ada1a423e2c68ad
MD5 | 88cef8694b02d33ae7521309e89bf97b
BLAKE2b-256 | a324a2e075bbcf2f16c5c724a4226b0684772ca897f6ccb11ce5fc8c03b8ad05
File details
Details for the file rear-0.1.2-py3-none-any.whl.
File metadata
- Download URL: rear-0.1.2-py3-none-any.whl
- Upload date:
- Size: 7.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/0.0.0 CPython/3.10.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 7872902002c7e328c2bb11c7cdea5dc3c7c71bcb347e8dfea279637dd856e89f
MD5 | 4276b568b048ccdddc35e44a727d2830
BLAKE2b-256 | cb0b658b4588a5dcbc0726288308754add8b89c1bfa30ccb5d2a9b17472ede24