Compress fastq with spring

License: MIT

Crunchy

A Python wrapper around spring and samtools that compresses fastq files to spring and bam files to cram. When compressing fastq files to spring, an integrity check can be performed with the --check-integrity flag:

crunchy compress spring --spring-path <springfile> --first <read_1.fastq> --second <read_2.fastq> --check-integrity
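The integrity-checked compression command can also be assembled from Python, for example when scripting many samples. This is a minimal sketch, assuming crunchy is installed and on the PATH; the helper name build_spring_compress_command and the file names are hypothetical, and only the flags shown in the command are used.

```python
from pathlib import Path
from typing import List


def build_spring_compress_command(
    first: Path, second: Path, spring_path: Path, check_integrity: bool = True
) -> List[str]:
    """Build the argument list for `crunchy compress spring` (hypothetical helper)."""
    cmd = [
        "crunchy", "compress", "spring",
        "--spring-path", str(spring_path),
        "--first", str(first),
        "--second", str(second),
    ]
    if check_integrity:
        # Ask crunchy to verify the compressed result, as described in the text
        cmd.append("--check-integrity")
    return cmd


# Example with placeholder file names; pass the list to subprocess.run(cmd, check=True)
# to actually execute it once the input files exist.
command = build_spring_compress_command(
    Path("read_1.fastq"), Path("read_2.fastq"), Path("reads.spring")
)
print(" ".join(command))
```

Passing the command as a list rather than a single string avoids shell quoting problems with unusual file names.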

Install

Pip

pip install crunchy

Docker

The Docker image contains crunchy together with samtools and spring.

docker pull clinicalgenomics/crunchy:0.5

Run crunchy using:

docker run clinicalgenomics/crunchy:0.5 crunchy

Developers

git clone https://github.com/Clinical-Genomics/crunchy
cd crunchy
pip install -e .
crunchy --help
Usage: crunchy [OPTIONS] COMMAND [ARGS]...

  Base command for crunchy

                .---. .---.
               :     : o   :    me want cookie!
           _..-:   o :     :-.._    /
       .-''  '  `---' `---' "   ``-.
     .'   "   '  "  .    "  . '  "  `.
    :   '.---.,,.,...,.,.,.,..---.  ' ;
    `. " `.                     .' " .'
     `.  '`.                   .' ' .'
      `.    `-._           _.-' "  .'  .----.
        `. "    '"--...--"'  . ' .'  .'  o   `.
        .'`-._'    " .     " _.-'`. :       o  :
      .'      ```--.....--'''    ' `:_ o       :
    .'    "     '         "     "   ; `.;";";";'
   ;         '       "       '     . ; .' ; ; ;
  ;     '         '       '   "    .'      .-'
  '  "     "   '      "           "    _.-'

Options:
  --spring-binary TEXT            Path to spring binary  [default: spring]
  --samtools-binary TEXT          Path to samtools binary  [default: samtools]
  -t, --threads INTEGER           Number of threads to use for spring
                                  compression  [default: 8]
  -r, --reference TEXT            Path to reference genome
  --log-level [DEBUG|INFO|WARNING]
                                  Choose what log messages to show
  --tmp-dir TEXT                  If specific temp dir should be used
  --help                          Show this message and exit.

Commands:
  auto        Run whole pipeline by compressing, comparing and deleting...
  compare     Compare two files by generating checksums.
  compress    Compress genomic files
  decompress  Decompress genomic files
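
The options listed above belong to the base command, so the usage line crunchy [OPTIONS] COMMAND [ARGS]... suggests they are given before the subcommand. The following is a minimal sketch of assembling such a call from Python; the helper name with_base_options and the directory name are hypothetical, and the validation only mirrors the choices printed in the help text.

```python
from typing import List, Optional

# Log levels listed in the help output
LOG_LEVELS = ("DEBUG", "INFO", "WARNING")


def with_base_options(
    subcommand: List[str],
    threads: Optional[int] = None,
    log_level: Optional[str] = None,
    tmp_dir: Optional[str] = None,
) -> List[str]:
    """Prefix a crunchy subcommand with base-command options (hypothetical helper)."""
    cmd = ["crunchy"]
    if threads is not None:
        if threads < 1:
            raise ValueError("threads must be a positive integer")
        cmd += ["--threads", str(threads)]
    if log_level is not None:
        if log_level not in LOG_LEVELS:
            raise ValueError(f"log_level must be one of {LOG_LEVELS}")
        cmd += ["--log-level", log_level]
    if tmp_dir is not None:
        cmd += ["--tmp-dir", tmp_dir]
    # Base options come first, then the subcommand and its own arguments
    return cmd + list(subcommand)


# Placeholder directory name; run the result with subprocess.run(cmd, check=True)
print(" ".join(with_base_options(["auto", "spring", "fastq_dir"], threads=4, log_level="DEBUG")))
```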

Workflow

Each command can be run separately. To compress all fastq pairs below a directory, run crunchy auto spring <path_to_dir>, which performs the following steps:

  1. Recursively find all fastq pairs.

  2. Compress each pair with spring: file_1.fastq + file_2.fastq (spring)-> file.spring

  3. Decompress the spring file: file.spring (spring)-> file_1.spring.fastq + file_2.spring.fastq

  4. Compare the checksum of each decompressed file with that of its original: file_1.spring.fastq + file_1.fastq (hashlib)-> compare

  5. Delete the original fastq files, but only if the compression was lossless: file_1.fastq + file_2.fastq (rm)->
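
Steps 4 and 5 can be sketched with hashlib as follows. This is an illustration under stated assumptions, not crunchy's own implementation: the function names are hypothetical, and SHA-256 is an assumed choice, since the workflow only says that hashlib is used.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Tuple


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()  # assumed algorithm; the source only names hashlib
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_lossless(original: Path, restored: Path) -> bool:
    """True if the decompressed file is byte-identical to the original."""
    return file_checksum(original) == file_checksum(restored)


def delete_if_lossless(pairs: Iterable[Tuple[Path, Path]]) -> bool:
    """Delete the originals only if every (original, restored) pair matches."""
    pairs = list(pairs)
    if not all(is_lossless(original, restored) for original, restored in pairs):
        return False  # keep everything when any file differs
    for original, _ in pairs:
        original.unlink()
    return True
```

Reading in chunks keeps memory use flat for large fastq files, and checking every pair before deleting anything means a single mismatch leaves all originals in place.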


