Compress fastq with spring

License: MIT

Crunchy

A Python wrapper around spring and samtools that compresses fastq files to spring and bam files to cram. When compressing fastq files to spring, an integrity check can be run by adding the --check-integrity flag:

crunchy compress spring --spring-path <springfile> --first <read_1.fastq> --second <read_2.fastq> --check-integrity
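When the compression is driven from a script, the command above can be assembled programmatically. The following is a minimal sketch, not part of crunchy itself: the helper names spring_compress_command and run_spring_compress are hypothetical, and only the subcommand and flags shown above are used. It assumes crunchy is installed and available on the PATH.

```python
import subprocess


def spring_compress_command(first, second, spring_path, check_integrity=True):
    """Build the argument list for `crunchy compress spring`.

    Hypothetical helper: it only mirrors the flags documented above.
    """
    cmd = [
        "crunchy", "compress", "spring",
        "--spring-path", str(spring_path),
        "--first", str(first),
        "--second", str(second),
    ]
    if check_integrity:
        # Ask crunchy to verify the archive after compression.
        cmd.append("--check-integrity")
    return cmd


def run_spring_compress(first, second, spring_path, check_integrity=True):
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    cmd = spring_compress_command(first, second, spring_path, check_integrity)
    return subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.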

Install

Pip

pip install crunchy

Docker

The Docker image contains crunchy together with samtools and spring.

docker pull clinicalgenomics/crunchy:0.5

Run crunchy using:

docker run clinicalgenomics/crunchy:0.5 crunchy

Developers

git clone https://github.com/Clinical-Genomics/crunchy
cd crunchy
pip install -e .
crunchy --help
Usage: crunchy [OPTIONS] COMMAND [ARGS]...

  Base command for crunchy

                .---. .---.
               :     : o   :    me want cookie!
           _..-:   o :     :-.._    /
       .-''  '  `---' `---' "   ``-.
     .'   "   '  "  .    "  . '  "  `.
    :   '.---.,,.,...,.,.,.,..---.  ' ;
    `. " `.                     .' " .'
     `.  '`.                   .' ' .'
      `.    `-._           _.-' "  .'  .----.
        `. "    '"--...--"'  . ' .'  .'  o   `.
        .'`-._'    " .     " _.-'`. :       o  :
      .'      ```--.....--'''    ' `:_ o       :
    .'    "     '         "     "   ; `.;";";";'
   ;         '       "       '     . ; .' ; ; ;
  ;     '         '       '   "    .'      .-'
  '  "     "   '      "           "    _.-'

Options:
  --spring-binary TEXT            Path to spring binary  [default: spring]
  --samtools-binary TEXT          Path to samtools binary  [default: samtools]
  -t, --threads INTEGER           Number of threads to use for spring
                                  compression  [default: 8]
  -r, --reference TEXT            Path to reference genome
  --log-level [DEBUG|INFO|WARNING]
                                  Choose what log messages to show
  --tmp-dir TEXT                  If specific temp dir should be used
  --help                          Show this message and exit.

Commands:
  auto        Run whole pipeline by compressing, comparing and deleting...
  compare     Compare two files by generating checksums.
  compress    Compress genomic files
  decompress  Decompress genomic files

Workflow

Each command can be run separately. To compress all fastq pairs below a directory, run:

crunchy auto spring <path_to_dir>

The auto command runs the whole pipeline:

  1. Recursively find all fastq pairs

  2. Compress each pair with spring
     file_1.fastq + file_2.fastq (spring)-> file.spring

  3. Decompress the spring file again
     file.spring (spring)-> file_1.spring.fastq + file_2.spring.fastq

  4. Compare the checksum of each decompressed file with that of the original
     file_1.spring.fastq + file_1.fastq (hashlib)-> compare

  5. Delete the original fastq files if the compression was lossless
     file_1.fastq + file_2.fastq (rm)->
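Steps 1 and 4 can be illustrated with the standard library alone. This is a sketch under stated assumptions, not crunchy's actual implementation: the function names are hypothetical, read pairs are assumed to follow the _1.fastq / _2.fastq naming used in the examples above, and SHA-256 is an arbitrary choice of hashlib algorithm.

```python
import hashlib
from pathlib import Path


def find_fastq_pairs(root):
    """Recursively yield (read_1, read_2) paths below root.

    Assumption: pairs are named <name>_1.fastq and <name>_2.fastq.
    Files without a matching second read are skipped.
    """
    suffix = "_1.fastq"
    for first in sorted(Path(root).rglob("*" + suffix)):
        second = first.with_name(first.name[: -len(suffix)] + "_2.fastq")
        if second.is_file():
            yield first, second


def checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large fastq files need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(original, restored):
    """Return True if both files have identical content checksums."""
    return checksum(original) == checksum(restored)
```

In such a workflow, the original fastq files would be deleted only when files_match returns True for both reads of a pair.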

