partake

Shard input to persistent parallel jobs at the rate of a pipe.

Example

seq 0 10000000 | partake -n 10 'cat'

Install

CLI version:

pip install partake

With development dependencies:

pip install "partake[dev]"

Recipes

Split a SAM/BAM/CRAM file into two .bam chunks:

samtools view input.bam | partake -n 2 -s 1 -o {id}.bam "bash -c 'cat <(samtools view -H input.bam) - | samtools view -b'"

Here, samtools view converts the input to SAM (plain text) and pipes the records to partake's stdin. The -s 1 option ensures that lines are not truncated. Inside the command, each worker prepends the header to its own record stream and converts the result back to BAM. The output files are 0.bam and 1.bam.
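The header-prepending step can also be pictured in Python. This is only a sketch of the idea behind `cat <(samtools view -H input.bam) -`, not how partake or samtools work internally; the function name `with_header` and the sample lines are made up for illustration and are not valid SAM records.

```python
import itertools


def with_header(header_lines, record_lines):
    """Yield the shared header first, then this shard's records."""
    return itertools.chain(header_lines, record_lines)


# Placeholder lines standing in for a real SAM header and real records.
header = ["@HD\tplaceholder-header-line\n", "@SQ\tplaceholder-sequence-line\n"]
shards = [
    ["record-a\n", "record-b\n"],  # records routed to worker 0
    ["record-c\n"],                # records routed to worker 1
]

# Every shard gets its own copy of the header, so each output is a complete file.
shard_outputs = [list(with_header(header, shard)) for shard in shards]

for worker_id, lines in enumerate(shard_outputs):
    print(worker_id, len(lines))
```

Each worker sees only a slice of the records, so the header has to be repeated per worker; otherwise the converted chunks would not be readable on their own.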

Read global line indexes within a custom Python script:

report.py

import os
import sys

# partake passes the file descriptor of the line-number stream in this variable.
read_fd = os.environ.get("PARTAKE_LINE_NUMBERS")
if read_fd is None:
    sys.exit("PARTAKE_LINE_NUMBERS is not set; run this script under partake")

# Line-buffered text stream with one global line index per input line.
line_numbers = os.fdopen(int(read_fd), buffering=1)

for line_number, data_line in zip(line_numbers, sys.stdin):
    print("Line number: ", line_number.strip(), "Data line: ", data_line.strip())

Run it:

seq 1000 10000000 | partake -n 2 -s 1 "python report.py"
head 0.out

Output:

Line number:  0 Data line:  1000
Line number:  1 Data line:  1001
Line number:  2 Data line:  1002
Line number:  3 Data line:  1003
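To try the descriptor-reading pattern without running partake, the parent side can be simulated with an ordinary pipe. The sketch below is an assumption-laden stand-in: the pipe, the three indexes written to it, and the helper name `pair_with_line_numbers` are all invented for the demo, and only the PARTAKE_LINE_NUMBERS variable name comes from the recipe above.

```python
import os


def pair_with_line_numbers(number_fd, data_lines):
    """Pair each data line with the index read from the given descriptor."""
    with os.fdopen(number_fd, buffering=1) as line_numbers:
        return [
            (int(number), data.rstrip("\n"))
            for number, data in zip(line_numbers, data_lines)
        ]


# Simulated parent: write a few indexes into a pipe and close the write end
# so the reader sees end-of-file.
read_fd, write_fd = os.pipe()
with os.fdopen(write_fd, "w") as writer:
    writer.write("0\n1\n2\n")

# Publish the read end the same way the recipe expects to find it.
os.environ["PARTAKE_LINE_NUMBERS"] = str(read_fd)

pairs = pair_with_line_numbers(
    int(os.environ["PARTAKE_LINE_NUMBERS"]),
    ["1000\n", "1001\n", "1002\n"],  # stand-in for sys.stdin
)
print(pairs)  # [(0, '1000'), (1, '1001'), (2, '1002')]
```

Because zip stops at the shorter input, the pairing ends as soon as either the index stream or the data stream is exhausted.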
