An end-to-end solution for processing Capture-C, Tri-C and Tiled-C data

Project description

CapCruncher

Analysis software for Capture-C, Tri-C and Tiled-C data.

CapCruncher is a tool designed to automate the processing of Capture-C, Tri-C and Tiled-C data from FASTQ files. The package is written in Python and consists of an end-to-end data processing pipeline together with a supporting command line interface that enables finer-grained control. The pipeline is fast, robust and scales from a laptop to a computational cluster.

For further information, see the documentation.

Changelog

[0.2.3] - 2022-08-19

Bug Fixes

  • Fixed a Tiled-C filtering error caused by a typo in the index specification for remove_dual_capture_fragments.
  • Fixed a bug when annotating Tiled-C data (#166) with the sorted option that caused no data to be annotated as a viewpoint or exclusion.
  • Fixed a bug with Tiled-C slice filtering (#166) that caused slices to be erroneously filtered.
  • Fixed a bug with counting reporters in batches (#166). Batching occurs when counting more than 1 × 10^6 slices per viewpoint.
  • Fixed a bug when merging and storing Tri-C or Tiled-C data (#166) using "capcruncher reporters store merge" or "capcruncher reporters store bins". This functionality has been rewritten and now appears to work correctly.
  • Fixed a bug with plotting matrices using CapCruncher's plotting capabilities (#166).
  • Fixed a bug where all slices were removed from the Parquet (reporter) outputs after a dask upgrade (the pyarrow-dataset engine has been removed). This corrects the "all dataframes are empty" error that occurred while generating reporter statistics.
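The batched counting mentioned above can be illustrated with a minimal sketch. This is not CapCruncher's code: the function names, the input layout (one list of restriction fragment IDs per read) and the pairwise counting rule are assumptions made for illustration. The point it shows is that per-batch counts must be summed into a running total, so the result does not depend on the batch size.

```python
from collections import Counter
from itertools import combinations, islice


def batched(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def count_in_batches(fragment_groups, batch_size=1_000_000):
    """Count co-occurring fragment pairs, one batch at a time.

    `fragment_groups` is a hypothetical input: an iterable in which each
    item lists the fragment IDs observed for one read.
    """
    total = Counter()
    for batch in batched(fragment_groups, batch_size):
        partial = Counter()
        for fragments in batch:
            # Deduplicate and sort so that (a, b) and (b, a) share one key.
            for pair in combinations(sorted(set(fragments)), 2):
                partial[pair] += 1
        # Add the batch counts to the running total instead of replacing it.
        total.update(partial)
    return total


groups = [[1, 2, 3], [2, 3], [3, 1], [5]]
counts = count_in_batches(groups, batch_size=2)
```

With this design, only one batch of intermediate counts is held in memory at a time, while the merged total stays identical to an unbatched count.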

Features

  • Added an option to normalise CCMatrix data using ICE followed by a scaling factor.
  • Reporter merging (Parquet) has been rewritten to use pyarrow directly. It is now faster and better able to split datasets into smaller files for more efficient parallel querying.
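As a rough illustration of what "ICE followed by a scaling factor" means, the sketch below balances a symmetric contact matrix by iterative correction and then multiplies the result by a constant. It is a dependency-free toy, not the CapCruncher implementation: the function name, parameters and defaults are hypothetical, and a real implementation would normally operate on NumPy or sparse matrices.

```python
def ice_normalise(matrix, scaling_factor=1.0, max_iter=200, tol=1e-6):
    """Balance a symmetric matrix by iterative correction, then rescale.

    Hypothetical helper for illustration only; `matrix` is a square
    list of lists of non-negative numbers.
    """
    n = len(matrix)
    m = [[float(v) for v in row] for row in matrix]
    for _ in range(max_iter):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        if not nonzero:
            break  # an all-zero matrix cannot be balanced
        mean = sum(nonzero) / len(nonzero)
        # Bias of each bin relative to the mean coverage; empty bins keep 1.0.
        bias = [s / mean if s > 0 else 1.0 for s in sums]
        for i in range(n):
            for j in range(n):
                m[i][j] /= bias[i] * bias[j]
        # Stop once every bias is close to 1, i.e. row sums are nearly equal.
        if max(abs(b - 1.0) for b in bias) < tol:
            break
    return [[v * scaling_factor for v in row] for row in m]


raw = [[10, 2, 4], [2, 8, 6], [4, 6, 20]]
balanced = ice_normalise(raw)
scaled = ice_normalise(raw, scaling_factor=2.0)
```

After balancing, every non-empty row has approximately the same sum. The scaling factor then sets the absolute level of the values, which can make matrices from different samples easier to compare.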

Miscellaneous Tasks

  • Updated version to 0.2.3 (#165)
  • Pin conda dependency versions to speed up environment solver (#167)

Download files

Source Distribution

capcruncher-0.2.3.tar.gz (126.3 kB)

Uploaded Source

Built Distribution

capcruncher-0.2.3-py3-none-any.whl (149.0 kB)

Uploaded Python 3
