
Shard input to persistent jobs at raw pipe speed.


partake

Shard input to persistent parallel jobs at the rate of a pipe.

Example

seq 0 10000000 | partake -n 10 'cat'

Install

CLI version:

pip install partake

With development dependencies:

pip install "partake[dev]"

Recipes

Split a SAM/BAM/CRAM file into 2 .bam chunks:

samtools view input.bam | partake -n 2 -s 1 -o {id}.bam "bash -c 'cat <(samtools view -H input.bam) - | samtools view -b'"

Here, samtools view converts the input to SAM (plain text) and pipes the records to partake's stdin. The -s 1 option ensures that lines are not truncated. Each worker's command prepends the header to its own record stream and converts the result back to BAM. The output files are 0.bam and 1.bam.
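The header-prepending step does not have to be a bash process substitution; the same idea can be written as a small Python filter. The sketch below is illustrative only: prepend_header is a hypothetical helper, not part of partake, and the SAM lines are made-up placeholders for the output of samtools view -H and for the records a worker would receive on stdin.

```python
import itertools
from typing import Iterable, Iterator


def prepend_header(header: Iterable[str], records: Iterable[str]) -> Iterator[str]:
    """Yield every header line first, then every record line, lazily."""
    return itertools.chain(header, records)


# Made-up SAM fragments (assumptions for illustration, not real data).
header = ["@HD\tVN:1.6\n", "@SQ\tSN:chr1\tLN:1000\n"]
records = ["read1\t0\tchr1\t1\n", "read2\t0\tchr1\t5\n"]

# Each worker would emit the header once, followed by its share of records.
stream = list(prepend_header(header, records))
print("".join(stream), end="")
```

Because the chain is lazy, a worker written this way would not need to hold its whole shard in memory before passing it on to samtools view -b.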

Read global line indexes within a custom Python script:

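report.py:

import os
import sys

# The file descriptor of the line-number stream is read from this variable.
read_fd = os.environ.get("PARTAKE_LINE_NUMBERS")
if read_fd is None:
    sys.exit("PARTAKE_LINE_NUMBERS is not set; run this script under partake")

# Line-buffered text stream with one global line index per input line.
line_numbers = os.fdopen(int(read_fd), buffering=1)

for line_number, data_line in zip(line_numbers, sys.stdin):
    print("Line number: ", line_number.strip(), "Data line: ", data_line.strip())

Run the script under partake, then inspect the output file of worker 0:

seq 1000 10000000 | partake -n 2 -s 1 "python report.py"
head 0.out

Line number:  0 Data line:  1000
Line number:  1 Data line:  1001
Line number:  2 Data line:  1002
Line number:  3 Data line:  1003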
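
To try the pairing logic without partake, the side channel can be simulated with ordinary pipes. This is a minimal sketch that assumes, as report.py implies, that the descriptor delivers one index per line in step with the data lines; pair_with_indexes is a hypothetical helper and the two pipes merely stand in for what partake would provide.

```python
import os


def pair_with_indexes(index_stream, data_stream):
    """Pair each data line with the global index read from the side channel."""
    return [(int(i), d.rstrip("\n")) for i, d in zip(index_stream, data_stream)]


# Stand-in for partake (an assumption): one pipe carries the indexes,
# another carries the data lines that a worker would see on stdin.
idx_r, idx_w = os.pipe()
data_r, data_w = os.pipe()
os.write(idx_w, b"0\n1\n2\n")
os.write(data_w, b"1000\n1001\n1002\n")
os.close(idx_w)
os.close(data_w)

# A real worker would take the descriptor number from the environment instead.
with os.fdopen(idx_r) as index_stream, os.fdopen(data_r) as data_stream:
    pairs = pair_with_indexes(index_stream, data_stream)

print(pairs)  # [(0, '1000'), (1, '1001'), (2, '1002')]
```

Keeping the pairing in a function that accepts any two line iterables makes it straightforward to unit-test a worker script before running it on a full input.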
