Taurus Datamover
Python wrapper for the datamover tools that enable moving data between the ZIH fileserver and the taurus cluster at TU Dresden
Before you can start working on the taurus cluster, you need to file a project request. You also need a group share on the ZIH fileserver.
Getting started
If you own a project space on taurus and want to transfer files to it, you can ask HPC support to enable a connection to your fileserver.
Afterwards, you can use TUD ZIH datamover to transfer files from the fileserver to the cluster.
This Python package provides a Python wrapper for the datamover tools.
Usage
First import taurus_datamover and create a Datamover object:
from taurus_datamover import Datamover, waitfor
dm = Datamover()
The object will have all datamover commands as methods. The most commonly used methods are:
dtls: list directory contents, equivalent to the Linux command ls
dtcp: copy files, like cp
dtmv: move files, like mv
dtrm: delete files, like rm
All commands take the same arguments as their Linux equivalents. Pass each argument as a separate string to the method. For example, the --help option tells you how to use them:
out, _ = dm.dtls('--help').communicate()
print(out.decode('utf-8'))
prints (truncated):
Usage: ls [OPTION]... [FILE]...
List information about the FILEs (the current directory by default).
Sort entries alphabetically if none of -cftuvSUX nor --sort is specified.
Mandatory arguments to long options are mandatory for short options too.
-a, --all do not ignore entries starting with .
-A, --almost-all do not list implied . and ..
--author with -l, print the author of each file
[...]
The --help argument is a little special, because it returns immediately. Normally, the commands run on a separate node in the background.
For example, listing the contents of the directory /grp/g_biapol
proc = dm.dtls('-lh','/grp/g_biapol')
initially just prints some messages from the cluster's queuing system:
srun: job 27549105 queued and waiting for resources
(some time passing here, until the job gets assigned to a node...)
srun: job 27549105 has been allocated resources
srun: error: ioctl(TIOCGWINSZ): Inappropriate ioctl for device
srun: error: Not using a pseudo-terminal, disregarding --pty option
The error messages are normal. They appear because the command was not run from an interactive terminal.
The whole process takes a while. Therefore, Datamover executes the commands in the background and returns immediately, so that you can do other work in the meantime. You can check the status with poll(). As long as it returns None, the process is still in progress:
proc.poll()
None
Once the process has finished, poll() returns the exit code (0 in case of success, an integer larger than 0 in case of an error):
proc.poll()
0
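The polling pattern described above can be sketched as a loop. This is a minimal illustration, assuming the object returned by a Datamover method behaves like a subprocess.Popen object (it offers poll() and communicate(), as used in this README). To keep the sketch runnable away from the cluster, a short-lived Python child process stands in for the datamover job.

```python
import subprocess
import sys
import time

# Stand-in for a call such as dm.dtls('-lh', '/grp/g_biapol').
# Assumption: the real return value is Popen-like.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(0.5); print('done')"],
    stdout=subprocess.PIPE,
)

while proc.poll() is None:  # None means the job is still running
    # Do other useful work here instead of just sleeping.
    time.sleep(0.1)

exit_code = proc.poll()     # an integer exit code once the job has finished
out, _ = proc.communicate() # collect the captured standard output
text = out.decode("utf-8").strip()
print(exit_code, text)
```

One caveat with this pattern: when standard output is captured through a pipe and the command produces a lot of output, the child can block until the pipe is read, so for large listings it may be safer to call communicate() rather than poll in a loop.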
If your code needs to wait for the result, use the waitfor helper function, which waits for the result and returns the exit code of the command:
proc = dm.dtls('-lh','/grp/g_biapol')
waitfor(proc)
Waiting ..
srun: job 27549647 queued and waiting for resources
......................................................................................................................
srun: job 27549647 has been allocated resources
.
srun: error: ioctl(TIOCGWINSZ): Inappropriate ioctl for device
srun: error: Not using a pseudo-terminal, disregarding --pty option
.
0
Only error messages are printed immediately. The normal output of the command is captured and can be retrieved like this:
out, _ = proc.communicate()
print(out.decode('utf-8'))
total 12K
drwx------ 20 roha044c 1111111 4.0K Jul 8 13:59 data
drwx------ 3 mazo260d 1111111 4.0K Jun 20 11:49 presentations
drwx------ 3 johamuel 1111111 4.0K Mar 10 14:53 projects
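If the output is needed for further processing rather than for printing, it can be split into lines. The helper below is a hypothetical convenience function, not part of taurus_datamover, and it assumes a Popen-like process whose standard output is captured. A Python child process stands in for the datamover job so the sketch runs anywhere.

```python
import subprocess
import sys

def output_lines(proc):
    """Hypothetical helper: wait for proc and return its stdout as a list of lines."""
    out, _ = proc.communicate()  # blocks until the process has finished
    return out.decode("utf-8").splitlines()

# Stand-in for something like dm.dtls('/grp/g_biapol').
proc = subprocess.Popen(
    [sys.executable, "-c", "print('data'); print('presentations'); print('projects')"],
    stdout=subprocess.PIPE,
)

names = output_lines(proc)
print(names)
```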
Examples
Recursively (-r) copy the directory data/test from the fileserver to the project space:
waitfor(dm.dtcp('-r','/grp/g_biapol/data/test', '/projects/p_bioimage/'))
Once it is done, we can see (and read) the files on the project space:
!ls -la /projects/p_bioimage/test/
total 12
drwxrwsr-x 3 tkorten p_bioimage 4096 Aug 3 16:31 .
drwxrws--T 6 root p_bioimage 4096 Aug 1 13:52 ..
drwx--S--- 2 tkorten p_bioimage 4096 Aug 3 16:31 test
However, from a normal node, we don't have write access:
!rm -r /projects/p_bioimage/test/test
rm: cannot remove '/projects/p_bioimage/test/test': Read-only file system
But using the datamover, we can write to the project space:
waitfor(dm.dtrm('-r', '/projects/p_bioimage/test'))
!ls -la /projects/p_bioimage/test/
ls: cannot access '/projects/p_bioimage/test/': No such file or directory
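In scripts that chain several transfers, it can help to stop as soon as one step fails. The following is a minimal sketch of such a check. The function name ensure_success is hypothetical and not part of taurus_datamover, and the process is again assumed to be Popen-like. Python child processes stand in for the datamover jobs.

```python
import subprocess
import sys

def ensure_success(proc, description):
    """Hypothetical helper: wait for proc and raise if its exit code is not 0."""
    out, _ = proc.communicate()  # waits for the process and drains its output
    if proc.returncode != 0:
        raise RuntimeError(f"{description} failed with exit code {proc.returncode}")
    return out.decode("utf-8")

# Stand-in for a successful job, e.g. a dtcp call.
ok = subprocess.Popen([sys.executable, "-c", "print('copied')"], stdout=subprocess.PIPE)
result = ensure_success(ok, "copy").strip()

# Stand-in for a failing job.
bad = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"], stdout=subprocess.PIPE)
try:
    ensure_success(bad, "copy")
    message = ""
except RuntimeError as err:
    message = str(err)

print(result, "|", message)
```

With the real package, the same helper could presumably be applied to the return value of dm.dtcp(...) or dm.dtrm(...) in place of the stand-in processes.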
Contributing
Contributions are very welcome. Tests can be run with tox. Please ensure that the coverage at least stays the same before you submit a pull request.
License
Distributed under the terms of the BSD-3 license, "biapol-taurus" is free and open source software.
Support
If you need support with the tools listed here, please open an issue.