ArchiveTeam seesaw kit

Seesaw toolkit

An asynchronous toolkit for distributed web processing. Written in Python and named after its behavior, it supports concurrent downloads, uploads, etc.

This toolkit is best known for its use in Archive Team projects. It also powers the Archive Team warrior.

Installation

Requires Python 2 or 3.

Needs the Tornado library for event-driven I/O. The complete list of required Python modules is in requirements.txt.

How to try it out

To run the example pipeline:

sudo pip install -r requirements.txt
./run-pipeline --help
./run-pipeline examples/example-pipeline.py someone

Point your browser to http://127.0.0.1:8001/.

You can also use run-pipeline2 or run-pipeline3 to be explicit about the Python version.

Overview

General idea: a set of Tasks that can be combined into a Pipeline that processes Items:

  • An Item is a thing that needs to be downloaded (a user, for example). It has properties that are filled by the Tasks.
  • A Task is a step in the download process: it takes an item, does something with it and passes it on. Example Tasks: getting an item name from the tracker, running a download script, rsyncing the result, notifying the tracker that it's done.
  • A Pipeline represents a sequence of Tasks. To make a seesaw script for a new project you'd specify a new Pipeline.
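
To make the three concepts concrete, here is a minimal, self-contained sketch. It does not use the seesaw API: the Item, Task and Pipeline classes below are simplified stand-ins, and SetItemName, FakeDownload and run_demo are hypothetical names invented for illustration. The real toolkit is asynchronous; this sketch runs each task synchronously to keep the idea visible.

```python
class Item(dict):
    """A unit of work; tasks fill in its properties as it moves along."""


class Task:
    """One step: takes an item, does something with it, passes it on."""

    def __init__(self, name):
        self.name = name

    def process(self, item):
        raise NotImplementedError


class SetItemName(Task):
    """Stand-in for asking a tracker which item to work on."""

    def __init__(self, item_name):
        super().__init__("SetItemName")
        self.item_name = item_name

    def process(self, item):
        item["item_name"] = self.item_name


class FakeDownload(Task):
    """Stand-in for a download script; only records where data would go."""

    def __init__(self):
        super().__init__("FakeDownload")

    def process(self, item):
        item["data_dir"] = "data/" + item["item_name"]


class Pipeline:
    """A fixed sequence of tasks applied to each item in order."""

    def __init__(self, *tasks):
        self.tasks = list(tasks)

    def run(self, item):
        for task in self.tasks:
            task.process(item)
            # Keep a trace of which tasks have handled the item.
            item.setdefault("log", []).append(task.name)
        return item


def run_demo():
    pipeline = Pipeline(SetItemName("someone"), FakeDownload())
    # The pipeline is fed an empty item; the tasks fill it in.
    return pipeline.run(Item())
```

Calling run_demo() returns an item whose properties were set entirely by the tasks, which mirrors how a new project would mostly consist of defining a new Pipeline.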

A Task can work on multiple Items at a time (e.g., multiple Wget downloads). The concurrency can be limited by wrapping the task in a LimitConcurrency Task: this will queue the items and run them one-by-one (e.g., a single Rsync upload).
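
The queueing behaviour can be sketched with a semaphore. This is an illustration of the idea only, not the toolkit's implementation: it uses the standard-library asyncio module rather than Tornado, and LimitedTask, fake_upload and upload_all are hypothetical names.

```python
import asyncio


class LimitedTask:
    """Wraps an async callable so at most `limit` items are in flight."""

    def __init__(self, work, limit=1):
        self._work = work
        self._semaphore = asyncio.Semaphore(limit)
        self.active = 0  # items currently being processed
        self.peak = 0    # highest value `active` ever reached

    async def process(self, item):
        # Items beyond the limit wait here until a slot frees up.
        async with self._semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await self._work(item)
            finally:
                self.active -= 1


async def fake_upload(item):
    # Stand-in for a slow operation such as an rsync upload.
    await asyncio.sleep(0.01)
    return item + ":uploaded"


async def upload_all(items, limit=1):
    task = LimitedTask(fake_upload, limit)
    # All items are submitted at once; the wrapper serialises them.
    results = await asyncio.gather(*(task.process(i) for i in items))
    return results, task.peak


def run_demo():
    return asyncio.run(upload_all(["a", "b", "c"], limit=1))
```

With limit=1 the recorded peak never exceeds one, even though all three items were handed to the task at the same time.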

The Pipeline needs to be fed empty Item objects; by controlling how many Items are active at once you limit how many are processed at the same time. (For example, add a new item each time an item leaves the pipeline.)
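
The feeding strategy in the parenthetical can be simulated in a few lines. This is a toy model under stated assumptions, not seesaw code: feed_pipeline is a hypothetical function, every item is assumed to need the same number of steps, and time advances in discrete rounds.

```python
def feed_pipeline(total_items, max_active, steps_per_item=2):
    """Simulate a feeder that tops the pipeline up to `max_active` items."""
    active = []    # items currently inside the pipeline
    started = 0    # how many empty items have been fed in so far
    finished = 0   # how many items have left the pipeline
    peak = 0       # largest number of simultaneously active items

    while finished < total_items:
        # Feed new empty items until the cap is reached.
        while len(active) < max_active and started < total_items:
            active.append({"id": started, "steps_left": steps_per_item})
            started += 1
        peak = max(peak, len(active))

        # Advance every active item by one step.
        for item in active:
            item["steps_left"] -= 1

        # Items with no steps left leave the pipeline, freeing a slot.
        still_active = [i for i in active if i["steps_left"] > 0]
        finished += len(active) - len(still_active)
        active = still_active

    return finished, peak
```

For example, feed_pipeline(5, 2) processes all five items while never holding more than two at once.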

With the ItemValue, ItemInterpolation and ConfigValue classes it is possible to pass item-specific arguments to the Task objects. The value of these objects will be re-evaluated for each item. Examples: a path name that depends on the item name, a configurable bandwidth limit, the number of concurrent downloads.
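
The re-evaluation idea can be sketched as deferred values that are resolved against each item just before a task runs. The classes below are simplified stand-ins with hypothetical names (LazyItemValue, LazyTemplate, LazySetting, realize, build_args), not the toolkit's own ItemValue, ItemInterpolation and ConfigValue; the %-style template syntax is an assumption made for this sketch.

```python
class LazyItemValue:
    """Resolves to one property of the current item."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


class LazyTemplate:
    """Resolves a %-style template using the item's properties."""

    def __init__(self, template):
        self.template = template

    def realize(self, item):
        return self.template % item


class LazySetting:
    """Resolves to a setting whose value may change between items."""

    def __init__(self, name, default):
        self.name = name
        self.value = default

    def realize(self, item):
        return self.value


def realize(value, item):
    # Plain values pass through; deferred values are resolved per item.
    if hasattr(value, "realize"):
        return value.realize(item)
    return value


def build_args(item, bandwidth):
    """Build a command line whose arguments depend on the item."""
    args = [
        "rsync",
        "--bwlimit", bandwidth,
        LazyTemplate("data/%(item_name)s/"),
        LazyItemValue("target"),
    ]
    return [str(realize(a, item)) for a in args]
```

Because every argument is resolved at call time, changing the setting between two items changes the second command line without rebuilding the task.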

Consult the wiki for more information.
