Compress fastq files with spring and check the integrity

License: MIT

Crunchy

A Python wrapper around spring and samtools that compresses fastq files to spring and bam files to cram. When compressing fastq files to spring, an integrity check can be performed by passing the --check-integrity flag:

crunchy compress spring --spring-path <springfile> --first <read_1.fastq> --second <read_2.fastq> --check-integrity
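
When crunchy is driven from another Python script, the compress command above can be assembled as an argument list. This is only a minimal sketch: the helper name spring_compress_command and the sample file names are hypothetical, and only the flags shown in this README are used.

```python
import shlex


def spring_compress_command(spring_path, first, second, check_integrity=True):
    """Build the argument list for `crunchy compress spring`.

    Hypothetical helper: it only assembles the flags documented in this
    README and does not call crunchy itself.
    """
    cmd = [
        "crunchy", "compress", "spring",
        "--spring-path", str(spring_path),
        "--first", str(first),
        "--second", str(second),
    ]
    if check_integrity:
        # Ask crunchy to verify the spring archive after compression
        cmd.append("--check-integrity")
    return cmd


if __name__ == "__main__":
    # Sample file names are placeholders
    cmd = spring_compress_command("sample.spring", "sample_1.fastq", "sample_2.fastq")
    print(shlex.join(cmd))
    # To actually run it (requires crunchy and spring to be installed):
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting problems when file paths contain spaces.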

Install

Pip

pip install crunchy

Docker

This will pull an image that contains crunchy as well as samtools and spring.

docker pull clinicalgenomics/crunchy:0.5

Run crunchy using:

docker run clinicalgenomics/crunchy:0.5 crunchy

Developers

git clone https://github.com/Clinical-Genomics/crunchy
cd crunchy
pip install -e .
crunchy --help
Usage: crunchy [OPTIONS] COMMAND [ARGS]...

  Base command for crunchy

                .---. .---.
               :     : o   :    me want cookie!
           _..-:   o :     :-.._    /
       .-''  '  `---' `---' "   ``-.
     .'   "   '  "  .    "  . '  "  `.
    :   '.---.,,.,...,.,.,.,..---.  ' ;
    `. " `.                     .' " .'
     `.  '`.                   .' ' .'
      `.    `-._           _.-' "  .'  .----.
        `. "    '"--...--"'  . ' .'  .'  o   `.
        .'`-._'    " .     " _.-'`. :       o  :
      .'      ```--.....--'''    ' `:_ o       :
    .'    "     '         "     "   ; `.;";";";'
   ;         '       "       '     . ; .' ; ; ;
  ;     '         '       '   "    .'      .-'
  '  "     "   '      "           "    _.-'

Options:
  --spring-binary TEXT            Path to spring binary  [default: spring]
  --samtools-binary TEXT          Path to samtools binary  [default: samtools]
  -t, --threads INTEGER           Number of threads to use for spring
                                  compression  [default: 8]
  -r, --reference TEXT            Path to reference genome
  --log-level [DEBUG|INFO|WARNING]
                                  Choose what log messages to show
  --tmp-dir TEXT                  If specific temp dir should be used
  --help                          Show this message and exit.

Commands:
  auto        Run whole pipeline by compressing, comparing and deleting...
  compare     Compare two files by generating checksums.
  compress    Compress genomic files
  decompress  Decompress genomic files

Workflow

Each command can be run separately. To compress all fastq pairs below a directory, run crunchy auto spring <path_to_dir>. The auto command performs the following steps:

  1. Recursively find all fastq pairs.

  2. Compress each pair with spring: file_1.fastq + file_2.fastq (spring)-> file.spring

  3. Decompress with spring: file.spring (spring)-> file_1.spring.fastq + file_2.spring.fastq

  4. Compare the checksums of the decompressed and original files: file_1.spring.fastq + file_1.fastq (hashlib)-> compare

  5. Delete the original fastq files if the compression was lossless: file_1.fastq + file_2.fastq (rm)->
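
Steps 1 and 4 of the workflow can be illustrated with a short standard-library sketch. It is not crunchy's own implementation: the function names are hypothetical, the _1.fastq / _2.fastq pairing rule is an assumption taken from the file names used in the steps above, and SHA-256 is an assumed choice, since the README only states that hashlib is used.

```python
import hashlib
from pathlib import Path


def find_fastq_pairs(directory):
    """Yield (read_1, read_2) for each *_1.fastq that has a matching *_2.fastq.

    Assumption: pairs are identified by the _1.fastq / _2.fastq suffixes.
    """
    for first in sorted(Path(directory).rglob("*_1.fastq")):
        second = first.with_name(first.name[: -len("_1.fastq")] + "_2.fastq")
        if second.is_file():
            yield first, second


def checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large fastq files are never fully in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original, restored):
    """Return True when both files have identical content (lossless round trip)."""
    return checksum(original) == checksum(restored)
```

In this sketch, the original fastq files would only be safe to delete when files_match returns True for both reads of a pair, for example files_match("file_1.fastq", "file_1.spring.fastq").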
