A thin wrapper to retrieve sequence from JAMO at NERSC and on Dori.

jamofetch

A thin wrapper on JAMO that allows sequence retrieval on Dori and at NERSC.

Installation

Jamofetch requires Python 3.8 or higher.

Jamofetch is available on PyPI: https://pypi.org/project/jamofetch/.

$ pip install jamofetch

Usage

See docs/demo_script.py in the project for a sample script.

Create a JamoFetcher:

from jamofetch.jamofetch import JamoFetcher, LibSeq

WAIT_INTERVAL = 10  # check if JAMO has provisioned sequence every 10 seconds
WAIT_MAX = 7200     # max wait for sequence is 7200 seconds or 2 hours

# directory where JAMO will link sequence, will be created if it doesn't exist
link_dir = '/tmp/sequence-links'

# create a fetcher
fetcher: JamoFetcher = JamoFetcher(link_dir=link_dir, wait_interval_secs=WAIT_INTERVAL, wait_max_secs=WAIT_MAX)

Note the following default configuration parameters for JamoFetcher. Setting wait_max_secs to -1, the default, causes the JamoFetcher instance to wait indefinitely for JAMO sequence.

class JamoFetcher():
    def __init__(self, link_dir='.', wait_interval_secs=10, wait_max_secs=-1):
        ...  # constructor body omitted
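
The interplay between wait_interval_secs and wait_max_secs can be illustrated with a small polling loop. This is only a sketch of the semantics described above, not jamofetch's actual implementation: the function name wait_until and its ready and sleep parameters are hypothetical and exist only for illustration.

```python
import time
from typing import Callable


def wait_until(
    ready: Callable[[], bool],
    wait_interval_secs: float = 10,
    wait_max_secs: float = -1,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll ready() until it returns True or the time budget is used up.

    A negative wait_max_secs means "wait indefinitely", mirroring the
    documented meaning of -1. Returns True if ready() succeeded and
    False if the maximum wait elapsed first.
    """
    waited = 0.0
    while not ready():
        # Give up once the budget is spent, unless waiting indefinitely.
        if wait_max_secs >= 0 and waited >= wait_max_secs:
            return False
        sleep(wait_interval_secs)
        waited += wait_interval_secs
    return True
```

With wait_interval_secs=10 and wait_max_secs=7200, a loop like this would check at most 721 times before giving up; with wait_max_secs=-1 it only returns once the check succeeds.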

Fetch sequence for a library, then print the path of the symlink to the sequence file and the real path to the file.

LIBRARY = 'NPUNN'
# Call JAMO to link sequence in the background.  Symlinks to the sequence
# files are created by JAMO in the directory specified by the
# link_dir parameter supplied to the JamoFetcher constructor.
lib_seq: LibSeq = fetcher.fetch_lib_seq(LIBRARY)

print(f"library name: {lib_seq.get_lib_name()}")
print(f"sequence symlink: {lib_seq.get_seq_path()}")
print(f"sequence real path: {lib_seq.get_real_path()}")

Check if sequence has been provided by JAMO, i.e. the symlink is not broken. Wait for sequence if it isn't ready.

if lib_seq.seq_exists():
    print("sequence ready")
else:
    # wait for JAMO
    real_path = lib_seq.get_real_path_wait()
    print(f"sequence ready at {real_path}")
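
The "symlink is not broken" condition can be checked with the standard library alone. The sketch below shows one way to express it; the helper name link_target_exists is hypothetical, and whether seq_exists() is implemented this way is an assumption.

```python
import os


def link_target_exists(path: str) -> bool:
    """Return True if path is a symlink whose target currently exists.

    os.path.islink() is True for any symlink, broken or not, while
    os.path.exists() follows the link and is False when the target
    is missing. Both must hold for a usable sequence link.
    """
    return os.path.islink(path) and os.path.exists(path)
```

A link created by JAMO for sequence that has not yet been restored would be a symlink with a missing target, so this check would return False until the file is in place.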

Command Line Tool

Installing jamofetch with pip exposes a command line interface.

(venv) [dnscott@ln005 jamofetch]$ jamofetch  -h
usage: jamofetch [-h] [-l LIBRARY] [-d DIRECTORY] [-i INTERVAL] [-m MAX] [-w] [--logging LOGGING]

options:
  -h, --help            show this help message and exit
  -l LIBRARY, --library LIBRARY
                        library name(s) for which to retrieve sequence
  -d DIRECTORY, --directory DIRECTORY
                        directory where to link sequence, defaults to current directory. Directory will be created if it doesn't exit.
  -i INTERVAL, --interval INTERVAL
                        wait interval in seconds to check if sequence has been fetched, ignored if wait flag not set
  -m MAX, --max MAX     maximum time to wait for sequence in seconds, ignored if wait flag not set. Specify -1 to wait indefinetely.
  -w, --wait            wait for jamo to link sequence, then print "sequence ready"
  --logging LOGGING     logging level (specify DEBUG for verbose logging)
(venv) [dnscott@ln005 jamofetch]$ jamofetch -d data -l NPUNN -l NOOHG -l HOGH -w --max -1
fetching sequence:

apptainer --silent run docker://doejgi/jamo-dori jamo link -s dori library NPUNN
NPUNN /global/dna/dm_archive/sdm/pacbio/00/27/47/pbio-2747.27352.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz BACKUP_COMPLETE 6391936239a7711d789a9380
NPUNN /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/NPUNN.pbio-2747.27352.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz

apptainer --silent run docker://doejgi/jamo-dori jamo link -s dori library NOOHG
NOOHG /global/dna/dm_archive/sdm/pacbio/00/26/91/pbio-2691.26653.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz RESTORED 6347dbb35bc59487d7e768d6
NOOHG /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/NOOHG.pbio-2691.26653.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz

apptainer --silent run docker://doejgi/jamo-dori jamo link -s dori library HOGH
HOGH /global/dna/dm_archive/sdm/illumina/00/63/97/6397.2.44053.GGCTAC.fastq.gz RESTORED 51d52a82067c014cd6ef4f6f
HOGH /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/HOGH.6397.2.44053.GGCTAC.fastq.gz

sequence links:
HOGH symlink: /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/HOGH.6397.2.44053.GGCTAC.fastq.gz
HOGH realpath: /clusterfs/jgi/scratch/dsi/aa/dm_archive/sdm/illumina/00/63/97/6397.2.44053.GGCTAC.fastq.gz
NOOHG symlink: /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/NOOHG.pbio-2691.26653.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz
NOOHG realpath: /clusterfs/jgi/scratch/dsi/aa/dm_archive/sdm/pacbio/00/26/91/pbio-2691.26653.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz
NPUNN symlink: /clusterfs/jgi/groups/dsi/homes/dnscott/git/jamofetch/data/NPUNN.pbio-2747.27352.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz
NPUNN realpath: /clusterfs/jgi/scratch/dsi/aa/dm_archive/sdm/pacbio/00/27/47/pbio-2747.27352.bc1001_BAK8A_OA--bc1001_BAK8A_OA.ccs.fastq.gz

waiting for JAMO to provision sequence . . . .
HOGH sequence ready
NOOHG sequence ready
NPUNN sequence ready
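
When the command line tool is driven from another Python script, the argument list can be assembled from the flags shown in the help text above. This is a minimal sketch: build_jamofetch_cmd is a hypothetical helper, not part of jamofetch, and it only uses the -d, -l, -w, -i and --max options listed in the help output.

```python
from typing import List, Optional, Sequence


def build_jamofetch_cmd(
    libraries: Sequence[str],
    directory: Optional[str] = None,
    wait: bool = False,
    interval: Optional[int] = None,
    max_secs: Optional[int] = None,
) -> List[str]:
    """Assemble an argument list for the jamofetch command line tool."""
    cmd = ["jamofetch"]
    if directory is not None:
        cmd += ["-d", directory]
    # -l is repeated once per library, as in the session above.
    for lib in libraries:
        cmd += ["-l", lib]
    if wait:
        cmd.append("-w")
        # Interval and max are ignored by the tool unless -w is set.
        if interval is not None:
            cmd += ["-i", str(interval)]
        if max_secs is not None:
            cmd += ["--max", str(max_secs)]
    return cmd
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a host where jamofetch is installed; building the list separately keeps the flag handling easy to inspect without calling JAMO.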

Credits

Jamofetch uses Will Holtz's doejgi/jamo-dori Docker image to call JAMO on the Dori cluster.

jamofetch was created with cookiecutter and the py-pkgs-cookiecutter template.
