ArchiveTeam seesaw kit

Seesaw toolkit
==============

An attempt at a toolkit for writing seesaw scripts in Python, with support for concurrent downloads, uploads, and similar tasks.

How to try it out
-----------------

To run the example pipeline:

    sudo pip install -r requirements.txt
    ./run-pipeline --help
    ./run-pipeline example-pipeline.py someone

Then point your browser to `http://127.0.0.1:8001/`.


Description
-----------

Needs the Tornado library for event-driven I/O.

General idea: a set of `Task`s that can be combined into a `Pipeline` that processes `Item`s:

* An `Item` is a thing that needs to be downloaded (a user, for example). It has properties that are filled by the `Task`s.
* A `Task` is a step in the download process: it takes an item, does something with it and passes it on. Example Tasks: getting an item name from the tracker, running a download script, rsyncing the result, notifying the tracker that it's done.
* A `Pipeline` represents a sequence of `Task`s. To make a seesaw script for a new project you'd specify a new `Pipeline`.
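
The relationship between the three concepts can be sketched in plain Python. This is a minimal, synchronous illustration and not the seesaw API: the names `ToyItem`, `ToyTask`, `ToyPipeline` and the two example actions are hypothetical, and the real toolkit runs its tasks asynchronously on Tornado.

```python
class ToyItem(dict):
    """A thing to be processed; tasks fill in its properties."""


class ToyTask:
    """One step of the process: takes an item, updates it, passes it on."""

    def __init__(self, name, action):
        self.name = name
        self.action = action  # callable that receives the item

    def process(self, item):
        self.action(item)
        # Record which tasks have handled this item, in order.
        item.setdefault("log", []).append(self.name)
        return item


class ToyPipeline:
    """A fixed sequence of tasks applied to each item in order."""

    def __init__(self, *tasks):
        self.tasks = tasks

    def run(self, item):
        for task in self.tasks:
            item = task.process(item)
        return item


def set_item_name(item):
    # Hypothetical stand-in for asking a tracker for an item name.
    item["item_name"] = "someone"


def set_data_dir(item):
    # Hypothetical stand-in for a download step that records its output path.
    item["data_dir"] = "data/" + item["item_name"]


pipeline = ToyPipeline(
    ToyTask("GetItem", set_item_name),
    ToyTask("Download", set_data_dir),
)
result = pipeline.run(ToyItem())
```

The item starts empty and leaves the pipeline carrying everything the tasks added to it, which is the pattern the list above describes.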

A `Task` can work on multiple `Item`s at a time (e.g., multiple Wget downloads). The concurrency can be limited by wrapping the task in a `LimitConcurrency` `Task`: this will queue the items and run them one-by-one (e.g., a single Rsync upload).
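
The queueing behaviour can be illustrated with a small wrapper. The sketch below swaps in `asyncio.Semaphore` from the standard library instead of seesaw's own `LimitConcurrency` implementation, so it shows the idea only; `LimitedTask` and `fake_upload` are hypothetical names.

```python
import asyncio


class LimitedTask:
    """Wraps an async callable so at most `limit` items run through it at once."""

    def __init__(self, limit, inner):
        self.limit = limit
        self.inner = inner
        self.active = 0  # items currently inside the wrapped task
        self.peak = 0    # highest value `active` ever reached
        self._semaphore = None  # created lazily, inside the running event loop

    async def process(self, item):
        if self._semaphore is None:
            self._semaphore = asyncio.Semaphore(self.limit)
        # Items that cannot acquire the semaphore wait here, i.e. they queue.
        async with self._semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await self.inner(item)
            finally:
                self.active -= 1


async def fake_upload(item):
    # Hypothetical stand-in for a single rsync upload.
    await asyncio.sleep(0.01)
    return item + " uploaded"


async def run_all(task, items):
    # Submit every item at once; the wrapper decides how many actually run.
    return await asyncio.gather(*(task.process(i) for i in items))


upload = LimitedTask(1, fake_upload)
results = asyncio.run(run_all(upload, ["a", "b", "c"]))
```

With a limit of 1, all three items are submitted together but pass through the wrapped step one at a time, so `upload.peak` never exceeds 1.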

The `Pipeline` needs to be fed empty `Item` objects; by controlling how many `Item`s are active at once you limit how many are processed concurrently. (For example, add a new item each time an item leaves the pipeline.)
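
One way to picture this feeding strategy is a loop that tops the pipeline up whenever an item finishes. The sketch below uses `asyncio` from the standard library and is not how seesaw's runner is implemented; `run_pipeline`, `feed`, `total_items` and `max_active` are hypothetical names, and `max_active` is assumed to be at least 1.

```python
import asyncio


async def run_pipeline(item):
    """Hypothetical stand-in for a full pipeline run on one item."""
    await asyncio.sleep(0.01)
    item["done"] = True
    return item


async def feed(total_items, max_active):
    """Keep at most `max_active` items in flight until `total_items` finish."""
    started = 0
    peak = 0
    finished = []
    pending = set()
    while started < total_items or pending:
        # Top up: add empty items while there is room in the pipeline.
        while started < total_items and len(pending) < max_active:
            pending.add(asyncio.ensure_future(run_pipeline({"id": started})))
            started += 1
        peak = max(peak, len(pending))
        # Block until at least one item leaves the pipeline.
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        finished.extend(task.result() for task in done)
    return finished, peak


finished, peak = asyncio.run(feed(total_items=5, max_active=2))
```

Here `peak` records the largest number of items that were active at the same moment, which stays at or below `max_active`.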

With the `ItemValue`, `ItemInterpolation` and `ConfigValue` classes it is possible to pass item-specific arguments to the `Task` objects. The value of these objects will be re-evaluated for each item. Examples: a path name that depends on the item name, a configurable bandwidth limit, the number of concurrent downloads.
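
The re-evaluation per item can be sketched as deferred values that are only resolved when a task runs for a specific item. The classes below are hypothetical stand-ins written for illustration, not the real `ItemValue`, `ItemInterpolation` and `ConfigValue` classes; the `realize` method name and the `%(key)s` template style are assumptions of this sketch.

```python
class LazyItemValue:
    """Looks up a single key on the item when realized."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


class LazyInterpolation:
    """Fills a %-style template from the item when realized."""

    def __init__(self, template):
        self.template = template

    def realize(self, item):
        return self.template % item


class LazyConfigValue:
    """Holds a setting that may be changed while the pipeline is running."""

    def __init__(self, default):
        self.value = default

    def realize(self, item):
        return self.value


def realize(value, item):
    """Resolve a lazy value for this item; pass plain values through."""
    if hasattr(value, "realize"):
        return value.realize(item)
    return value


def build_args(args, item):
    """Evaluate every argument again for the given item."""
    return [str(realize(arg, item)) for arg in args]


rate_limit = LazyConfigValue("100k")
args = [
    "download",  # plain string, used as-is
    LazyItemValue("item_name"),
    rate_limit,
    LazyInterpolation("data/%(item_name)s"),
]

first = build_args(args, {"item_name": "alice"})
rate_limit.value = "50k"  # a setting changed between items
second = build_args(args, {"item_name": "bob"})
```

The same `args` list yields different concrete arguments for each item, and a changed setting takes effect on the next item without rebuilding the task.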
