An end-to-end solution for processing Capture-C, Tri-C and Tiled-C data

Project description

CapCruncher

Analysis software for Capture-C, Tri-C and Tiled-C data.

CapCruncher is a tool designed to automate the processing of Capture-C, Tri-C and Tiled-C data from FASTQ files. The package is written in Python and consists of an end-to-end data processing pipeline together with a supporting command line interface that enables finer-grained control. The pipeline is fast, robust and scales from a laptop to a computational cluster.

For further information, see the documentation.

Changelog

[0.2.3] - 2022-08-19

Bug Fixes

  • Fixed a Tiled-C filtering error caused by a typo in the index specification for remove_dual_capture_fragments.
  • Fixed a bug when annotating Tiled-C data (#166) with the sorted option that caused no data to be annotated as a viewpoint or exclusion.
  • Fixed a bug with Tiled-C slice filtering (#166) that caused slices to be erroneously filtered.
  • Fixed a bug with counting reporters in batches (this occurs when counting >1×10^6 slices per viewpoint) (#166).
  • Fixed a bug when merging and storing Tri-C or Tiled-C data (#166) using "capcruncher reporters store merge" or "capcruncher reporters store bins". This functionality has been rewritten and now appears to work correctly.
  • Fixed a bug with plotting matrices using CapCruncher's plotting capabilities (#166).
  • Fixed a bug where all slices were removed from the reporter Parquet outputs after a dask upgrade (the pyarrow-dataset engine has been removed). This corrects the "all dataframes are empty" error that occurred while generating reporter statistics.
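As a rough illustration of what counting in batches means, the sketch below splits a stream of fragment pairs into fixed-size batches, counts each batch and merges the totals. It is not CapCruncher's code: the function names, the pair representation and the batch size are all assumptions made for this example.

```python
from collections import Counter
from itertools import islice


def iter_batches(items, batch_size):
    """Yield successive lists of at most batch_size items (hypothetical helper)."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_in_batches(fragment_pairs, batch_size=1_000_000):
    """Count fragment pairs batch by batch and merge the per-batch totals."""
    totals = Counter()
    for batch in iter_batches(fragment_pairs, batch_size):
        # Sort each pair so that (a, b) and (b, a) count as the same interaction.
        totals.update(tuple(sorted(pair)) for pair in batch)
    return totals


# Toy data: three of the five pairs describe the interaction between fragments 1 and 2.
pairs = [(1, 2), (2, 1), (3, 4), (1, 2), (5, 6)]
counts = count_in_batches(pairs, batch_size=2)
print(counts[(1, 2)])  # 3
```

The merged totals should not depend on the batch size, which makes that a useful property to check when a batched code path is only triggered by large inputs.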

Features

  • Added an option to normalise CCMatrix data using ICE followed by a scaling factor.
  • Reporter merging (Parquet) has been rewritten to use pyarrow directly. It is now faster and better able to split datasets into smaller files for more efficient parallel querying.
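To give a feel for the normalisation option above, here is a minimal sketch of iterative correction (matrix balancing in the spirit of ICE) followed by multiplication with a scaling factor. It is not the CCMatrix implementation: the function names are hypothetical, the matrix is a plain list of lists, and the square-root update is a damped variant chosen here so that a small example converges reliably.

```python
import math


def ice_balance(matrix, max_iter=200, tol=1e-6):
    """Balance a symmetric contact matrix so non-empty rows have near-equal sums."""
    n = len(matrix)
    balanced = [[float(value) for value in row] for row in matrix]
    for _ in range(max_iter):
        row_sums = [sum(row) for row in balanced]
        nonzero = [s for s in row_sums if s > 0]
        if not nonzero:
            break  # nothing to balance in an all-zero matrix
        mean_sum = sum(nonzero) / len(nonzero)
        # Damped per-bin correction factor; empty bins get 1.0 and stay untouched.
        factors = [math.sqrt(s / mean_sum) if s > 0 else 1.0 for s in row_sums]
        for i in range(n):
            for j in range(n):
                balanced[i][j] /= factors[i] * factors[j]
        if max(abs(f - 1.0) for f in factors) < tol:
            break
    return balanced


def normalise(matrix, scaling_factor=1.0):
    """Balance the matrix, then multiply every entry by scaling_factor."""
    return [[value * scaling_factor for value in row] for row in ice_balance(matrix)]


# Toy symmetric matrix; real contact matrices are far larger and usually sparse.
contacts = [[10, 4, 2], [4, 8, 6], [2, 6, 12]]
for row in normalise(contacts, scaling_factor=2.0):
    print([round(value, 3) for value in row])
```

Balancing removes per-bin coverage biases, and the scaling factor then puts the balanced values on whatever scale is convenient for comparison between samples.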

Miscellaneous Tasks

  • Updated version to 0.2.3 (#165)
  • Pin conda dependency versions to speed up environment solver (#167)

Download files

Source Distribution

capcruncher-0.2.3.tar.gz (126.3 kB)

Uploaded Source

Built Distribution

capcruncher-0.2.3-py3-none-any.whl (149.0 kB)

Uploaded Python 3

File details

Details for the file capcruncher-0.2.3.tar.gz.

File metadata

  • Download URL: capcruncher-0.2.3.tar.gz
  • Upload date:
  • Size: 126.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for capcruncher-0.2.3.tar.gz
Algorithm Hash digest
SHA256 5168d168d45bae6421da2c4a5b226c2dae4c1703f66dd3ef1689d55ec4d6a9ed
MD5 9d8d09e4392bea3fe08ee8d8b32458dd
BLAKE2b-256 e7da78b7ae1ccdca4a9c2a0abdd16e308facc1909da2e159ce2d565c05fad639

File details

Details for the file capcruncher-0.2.3-py3-none-any.whl.

File metadata

  • Download URL: capcruncher-0.2.3-py3-none-any.whl
  • Upload date:
  • Size: 149.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.13

File hashes

Hashes for capcruncher-0.2.3-py3-none-any.whl
Algorithm Hash digest
SHA256 9e0635de905e49f5c7e32a0b332856907743f990f6eec7c36e608da8418786b9
MD5 3f373653917ca8274081280c3739c7e0
BLAKE2b-256 37328d0b33d160aeec619f8cc86611ac7771733f9a300176f3b7fbc5f94dc51a
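
The digests listed above can be used to check a downloaded file before installing it. The sketch below uses only Python's standard hashlib module; the helper names are made up for this example, and it assumes the source archive has been saved to the current working directory.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Compare a file's SHA256 digest with the published value."""
    return sha256_of_file(path) == expected_sha256.strip().lower()


# SHA256 published above for the source distribution.
EXPECTED = "5168d168d45bae6421da2c4a5b226c2dae4c1703f66dd3ef1689d55ec4d6a9ed"
archive = Path("capcruncher-0.2.3.tar.gz")  # assumed download location
if archive.exists():
    print("OK" if verify_download(archive, EXPECTED) else "MISMATCH")
```

The same helper works for the wheel by swapping in its file name and SHA256 value.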
