Skip to main content

Trivial Extensible Job-submission system

Project description

Clusters typically come with job-submission and queueing systems. These systems handle a queue of jobs, which might spawn multiple nodes, have a priorities, dependencies, expected runtimes, deadlines…

Tej doesn’t aim at doing any of that. It just allows you to submit a job to a single server, that will run it immediately, and allow you to check its status and get its results later on.

Of course, tej is extensible, which allows you to add some queueing and scheduling abilities should you want to.

The goal of tej is to be usable without having to configure the server beforehand; it will setup the structure it needs on the server on the first run if necessary (in its simplest form, a ~/.tej directory on the server, that will contain the jobs).

Usage

Sets up tej on the server (optional, else it gets setup on the first run, with default options):

$ tej setup user@server.hostna.me \
    --queue /scratch/tejqueue \
    --make-link ~/.tej \
    --plugin default

This takes a destination to SSH into, the location of tej’s directory (there can be several on a server; by default, ~/.tej is used), --make-link creates a link so that future invocations will be redirected to /scract/tejqueue, and --plugin selects which plugins to setup on the server (since tej is extensible, other scheduling/running subsystems might be added in the future).

Submit a simple job:

$ tej submit user@server.hostna.me myjobdir
Job submitted as:
myjobdir_user_123456

Here myjobdir is assumed to have the default layout, and no metadata is added. The directory will be uploaded in its entirety, and start.sh will be run.

Submit a job explicitely:

$ tej submit user@server.hostna.me --queue=/scratch/tejqueue \
    --id example_job \
    --script bin/jobinit \
    myjobdir
Job submitted as:
example_job

Get the status of a job:

$ tej status user@server.hostna.me --id myjobdir_user_123456
Job is still running (1:28:57)
$ tej status user@server.hostna.me --queue=/scratch/tejqueue \
    --id example_job
Job is finished (1:30:01)
$ tej status user@server.hostna.me --id myjobdir_user_567890
No job 'myjobdir_user_567890'

Download the output from a finished job:

$ tej download user@server.hostna.me --id myjobdir_user_123456 \
    output/log.txt
$ tej download user@server.hostna.me --id myjobdir_user_123456 \
    results.csv view.png input.bin

Note that there is no need for the file to be an output. The files are downloaded to the current directory.

Kill a running job:

$ tej kill user@server.hostna.me --id example_job
Job 'example_job' has already completed
$ tej kill user@server.hostna.me --id myjobdir_user_123456
Job 'myjobdir_user_123456' killed
$ tej kill user@server.hostna.me --id myjobdir_user_567890
No job 'myjobdir_user_567890'

Cleanup a finished job:

$ tej delete user@server.hostna.me --id example_job
Deleted job 'example_job'

Note that this is still alpha software. The command-line interface, in particular, is likely to evolve. Feel free to give me your opinion on it or direct me your feature requests/patches on Github.

Name

“tej” /tɛʒ/ is French slang for throwing/casting. It’s intended here to be used as a verb (“let me tej it to the server…”, “Is it done yet? I tej’d that yesterday!”). Probably not the best name, but it wasn’t taken, and it’s short.

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tej-0.2.3.tar.gz (13.6 kB view hashes)

Uploaded Source

Built Distribution

tej-0.2.3-py2.py3-none-any.whl (19.3 kB view hashes)

Uploaded Python 2 Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page