ArchiveTeam seesaw kit
Seesaw toolkit
An asynchronous toolkit for distributed web processing. Written in Python and named after its behavior, it supports concurrent downloads, uploads, etc.
This toolkit is best known for its use in Archive Team projects. It also powers the Archive Team warrior.
Installation
Requires Python 2 or 3.
Needs the Tornado library for event-driven I/O. The complete list of required Python modules is in requirements.txt.
How to try it out
To run the example pipeline:
sudo pip install -r requirements.txt
./run-pipeline --help
./run-pipeline examples/example-pipeline.py someone
Point your browser to http://127.0.0.1:8001/.
You can also use run-pipeline2 or run-pipeline3 to be explicit about the Python version.
Overview
General idea: a set of Tasks that can be combined into a Pipeline that processes Items:
- An Item is a thing that needs to be downloaded (a user, for example). It has properties that are filled in by the Tasks.
- A Task is a step in the download process: it takes an item, does something with it and passes it on. Example Tasks: getting an item name from the tracker, running a download script, rsyncing the result, notifying the tracker that it's done.
- A Pipeline represents a sequence of Tasks. To make a seesaw script for a new project you'd specify a new Pipeline.
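The relationship between the three concepts can be sketched in plain Python. This is a toy model for illustration only: the classes below are hypothetical stand-ins written for this example, not the seesaw classes, and names such as GetItemName, FakeDownload and the data/ path are assumptions.

```python
# Toy model only: these classes mimic the idea, they are NOT the seesaw API.

class Item(dict):
    """A unit of work; tasks fill in its properties as it moves along."""


class Task:
    """One step: take an item, do something with it, pass it on."""

    def __init__(self, name):
        self.name = name

    def process(self, item):
        raise NotImplementedError


class GetItemName(Task):
    """Hypothetical task: hand out item names (a stand-in for a tracker)."""

    def __init__(self, names):
        super().__init__("GetItemName")
        self._names = iter(names)

    def process(self, item):
        item["item_name"] = next(self._names)


class FakeDownload(Task):
    """Hypothetical task: pretend to download and record where the data went."""

    def __init__(self):
        super().__init__("FakeDownload")

    def process(self, item):
        item["data_path"] = "data/%s.warc.gz" % item["item_name"]


class Pipeline:
    """A sequence of tasks applied to each item in order."""

    def __init__(self, *tasks):
        self.tasks = tasks

    def run(self, item):
        for task in self.tasks:
            task.process(item)
            # Record which tasks the item has passed through.
            item.setdefault("log", []).append(task.name)
        return item


pipeline = Pipeline(GetItemName(["alice", "bob"]), FakeDownload())
first = pipeline.run(Item())   # the pipeline is fed an empty item
second = pipeline.run(Item())
print(first["item_name"], first["data_path"])
```

Each item starts empty and leaves the pipeline carrying the properties the tasks added along the way.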
A Task can work on multiple Items at a time (e.g., multiple Wget downloads). The concurrency can be limited by wrapping the task in a LimitConcurrency Task: this will queue the items and run them one-by-one (e.g., a single Rsync upload).
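The queuing behaviour described above can be illustrated with a small sketch. It uses asyncio and a semaphore from the standard library rather than Tornado or the real LimitConcurrency class, so it shows the idea only; LimitedTask and fake_upload are hypothetical names introduced for this example.

```python
import asyncio


class LimitedTask:
    """Wrap an async task function so at most `limit` items run at once.

    Hypothetical helper for illustration; not the seesaw LimitConcurrency class.
    """

    def __init__(self, limit, func):
        self._limit = limit
        self._func = func
        self._semaphore = None
        self.active = 0
        self.max_active = 0  # highest number of items seen running together

    async def process(self, item):
        # Create the semaphore lazily so it belongs to the running event loop.
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self._limit)
        async with self._semaphore:  # extra items wait here, i.e. are queued
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            try:
                return await self._func(item)
            finally:
                self.active -= 1


async def fake_upload(item):
    """Stand-in for a slow upload such as an rsync run."""
    await asyncio.sleep(0.01)
    return "uploaded " + item


async def main():
    upload = LimitedTask(1, fake_upload)  # limit of 1: one upload at a time
    results = await asyncio.gather(*(upload.process(n) for n in ["a", "b", "c"]))
    return upload, results


upload, results = asyncio.run(main())
print(results, "peak concurrency:", upload.max_active)
```

All three items are submitted together, but with a limit of 1 they pass through the wrapped task one after another.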
The Pipeline needs to be fed empty Item objects; by controlling the number of active Items you can limit how many items are processed at the same time. (For example, add a new item each time an item leaves the pipeline.)
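A minimal sketch of that feeding strategy, assuming a fixed cap on active items and a fixed total amount of work. The Feeder class and its method names are hypothetical and exist only for this illustration; seesaw's own runner is not shown here.

```python
from collections import deque


class Feeder:
    """Keep at most `max_active` items in flight; top up when one leaves."""

    def __init__(self, max_active, total):
        self.max_active = max_active
        self.remaining = total   # how many empty items may still be created
        self.active = deque()    # items currently inside the pipeline
        self.finished = []       # ids of items that have left the pipeline
        self._next_id = 0

    def top_up(self):
        # Add empty items until the cap is reached or the work runs out.
        while self.remaining > 0 and len(self.active) < self.max_active:
            self._next_id += 1
            self.remaining -= 1
            self.active.append({"id": self._next_id})

    def item_left_pipeline(self):
        # Called when the oldest active item finishes: replace it with a new one.
        item = self.active.popleft()
        self.finished.append(item["id"])
        self.top_up()


feeder = Feeder(max_active=2, total=5)
feeder.top_up()
peak = len(feeder.active)
while feeder.active:
    feeder.item_left_pipeline()
    peak = max(peak, len(feeder.active))

print(feeder.finished, "peak active items:", peak)
```

The cap is enforced purely by when new empty items are created, which is the mechanism the paragraph above describes.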
With the ItemValue, ItemInterpolation and ConfigValue classes it is possible to pass item-specific arguments to the Task objects. The value of these objects will be re-evaluated for each item. Examples: a path name that depends on the item name, a configurable bandwidth limit, the number of concurrent downloads.
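The idea of arguments that are re-evaluated for each item can be sketched as follows. The classes below are hypothetical stand-ins (LazyItemValue, LazyInterpolation, LazyConfigValue) written for this example; they imitate the concept and should not be read as the signatures of the real seesaw classes. The command name and option names are likewise made up.

```python
class LazyItemValue:
    """Look up one property of the current item."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


class LazyInterpolation:
    """Fill a %-style template from the current item's properties."""

    def __init__(self, template):
        self.template = template

    def realize(self, item):
        return self.template % item


class LazyConfigValue:
    """Read a setting at evaluation time, so later changes are picked up."""

    def __init__(self, config, name):
        self.config = config
        self.name = name

    def realize(self, item):
        return self.config[self.name]


def realize(value, item):
    """Evaluate lazy values against an item; pass plain values through."""
    if hasattr(value, "realize"):
        return value.realize(item)
    return value


def build_command(args, item):
    # The argument list is defined once and resolved separately for each item.
    return [str(realize(arg, item)) for arg in args]


config = {"rate_limit_kbps": 500}
args = [
    "download",                                     # hypothetical command
    "--user", LazyItemValue("item_name"),
    "--out", LazyInterpolation("data/%(item_name)s.warc.gz"),
    "--limit-rate", LazyConfigValue(config, "rate_limit_kbps"),
]

cmd_alice = build_command(args, {"item_name": "alice"})
config["rate_limit_kbps"] = 100                     # setting changed at runtime
cmd_bob = build_command(args, {"item_name": "bob"})
print(cmd_alice)
print(cmd_bob)
```

Because evaluation is deferred until an item is available, the same argument list yields a different path for each item and reflects the configuration at the time the item is processed.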
Consult the wiki for more information.
Download files
File details
Details for the file seesaw2-00.10.3.tar.gz.
File metadata
- Download URL: seesaw2-00.10.3.tar.gz
- Upload date:
- Size: 152.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.3 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | 9827c5e02387c85a612221e84841c0e16d3ecbd452989cae7a6e6048acaa8429
MD5 | 34a940c8a3bc4015f84cf3f1fdbf1535
BLAKE2b-256 | d6b1cf4af0a5f5c07ef6799c057825793f60217fed5694367f33193f27ab0ee8
File details
Details for the file seesaw2-00.10.3-py3-none-any.whl.
File metadata
- Download URL: seesaw2-00.10.3-py3-none-any.whl
- Upload date:
- Size: 149.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.2 importlib_metadata/4.6.3 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.9.7
File hashes
Algorithm | Hash digest
---|---
SHA256 | f83949e4171f3b3202b226067f313ce41014829d6a9c8efb3d66973cac0dfb62
MD5 | f609a32ef380edd8afcc296670f526a8
BLAKE2b-256 | b899115473a5c2b482728b4c335c8afa67a0b5fa8c7098920e5cca4ad760ea52