MCCNado: Rust-based tools for processing Micro-Capture-C data with SeqNado

MCCNado

A high-performance Rust library with Python bindings for processing Micro-Capture-C (MCC) sequencing data.

Overview

MCCNado is a bioinformatics tool designed for analyzing chromatin conformation capture sequencing data. It provides efficient implementations for common preprocessing tasks including FASTQ deduplication, viewpoint read splitting, BAM annotation, and ligation junction analysis.

Features

  • FASTQ Deduplication: Remove duplicate reads from single-end and paired-end FASTQ files
  • Viewpoint Read Splitting: Split reads containing viewpoint sequences into constituent segments
  • BAM Annotation: Add metadata tags to BAM files for downstream analysis
  • Ligation Junction Identification: Extract and analyze chromatin interaction data
  • Ligation Statistics: Generate comprehensive statistics on cis/trans interactions
  • High Performance: Implemented in Rust with optional async processing for large datasets
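To illustrate the cis/trans distinction behind the ligation statistics, here is a minimal sketch. It is not MCCNado's implementation: it assumes each junction is available as a pair of chromosome names (viewpoint, reporter), and the function name and data layout are hypothetical.

```python
from collections import Counter


def count_cis_trans(junctions):
    """Count cis (same chromosome) and trans (different chromosome) junctions.

    `junctions` is an iterable of (viewpoint_chrom, reporter_chrom) pairs.
    This layout is an assumption made for illustration only.
    """
    counts = Counter({"cis": 0, "trans": 0})
    for viewpoint_chrom, reporter_chrom in junctions:
        key = "cis" if viewpoint_chrom == reporter_chrom else "trans"
        counts[key] += 1
    return dict(counts)


# Hypothetical junctions: two on the viewpoint chromosome, one elsewhere.
example = [("chr1", "chr1"), ("chr1", "chr5"), ("chr2", "chr2")]
print(count_cis_trans(example))
```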

Installation

From PyPI (recommended)

pip install mccnado

From Source

git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
pip install .

Development Installation

git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
pip install -e .

Requirements

  • Python 3.7+
  • Rust (for building from source)
  • samtools (for BAM file processing)

Usage

Python API

import mccnado

# Deduplicate FASTQ files
stats = mccnado.deduplicate_fastq(
    fastq1="input_R1.fastq.gz",
    output1="output_R1.fastq.gz",
    fastq2="input_R2.fastq.gz",  # Optional for paired-end
    output2="output_R2.fastq.gz"  # Optional for paired-end
)

print(f"Total reads: {stats['total_reads']}")
print(f"Unique reads: {stats['unique_reads']}")
print(f"Duplicate reads: {stats['duplicate_reads']}")

# Split viewpoint reads
mccnado.split_viewpoint_reads(
    bam="aligned_reads.bam",
    output="split_reads.fastq.gz"
)

# Annotate BAM file with MCC metadata
mccnado.annotate_bam(
    bam="input.bam",
    output_directory="annotated_output/"
)

# Extract ligation statistics
mccnado.extract_ligation_stats(
    bam="annotated.bam",
    stats="ligation_stats.json"
)
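The deduplication statistics can be turned into a single quality-control number. The sketch below assumes only the three keys printed in the example above; the helper name and the sample numbers are hypothetical.

```python
def duplication_rate(stats):
    """Return the fraction of reads flagged as duplicates (0.0 if no reads)."""
    total = stats["total_reads"]
    if total == 0:
        return 0.0
    return stats["duplicate_reads"] / total


# Hypothetical numbers standing in for the dictionary returned by
# mccnado.deduplicate_fastq; real values depend on your data.
stats = {"total_reads": 1000, "unique_reads": 800, "duplicate_reads": 200}
print(f"Duplication rate: {duplication_rate(stats):.1%}")
```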

Command Line Interface

The package also provides a command-line interface through the Python module:

# Deduplicate FASTQ files
python -m mccnado.cli deduplicate input_R1.fastq.gz output_R1.fastq.gz

# Split viewpoint reads
python -m mccnado.cli split-reads aligned_reads.bam split_reads.fastq.gz

# Annotate BAM files
python -m mccnado.cli annotate input.bam output_directory/

# Extract ligation statistics
python -m mccnado.cli ligation-stats annotated.bam stats.json
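When scripting several steps, the commands above can be assembled from Python. This is a sketch under the assumption that every subcommand takes exactly one input and one output positional argument, as in the examples; `build_command` is a hypothetical helper, not part of MCCNado.

```python
import sys

# Subcommand names taken from the CLI examples above.
SUBCOMMANDS = {"deduplicate", "split-reads", "annotate", "ligation-stats"}


def build_command(subcommand, input_path, output_path):
    """Return the argv list for one `python -m mccnado.cli` invocation."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return [sys.executable, "-m", "mccnado.cli", subcommand, input_path, output_path]


cmd = build_command("annotate", "input.bam", "output_directory/")
print(" ".join(cmd[1:]))
```

To run the command, pass the list to `subprocess.run(cmd, check=True)`, which raises an error if the step exits with a non-zero status.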

File Formats

Input Files

  • FASTQ: Raw sequencing reads (single-end or paired-end, gzipped or uncompressed)
  • BAM: Aligned reads with a valid header and an index
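Accepting both gzipped and uncompressed FASTQ can be done by checking the gzip magic bytes instead of trusting the file extension. The following is a standard-library sketch of that idea, not MCCNado's reader; both function names are hypothetical, and it assumes four-line FASTQ records.

```python
import gzip


def open_fastq(path):
    """Open a FASTQ file as text, detecting gzip by its two magic bytes."""
    with open(path, "rb") as handle:
        is_gzip = handle.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gzip else open(path, "rt")


def read_fastq(path):
    """Yield (name, sequence, quality) tuples from a four-line-per-record FASTQ."""
    with open_fastq(path) as handle:
        while True:
            name = handle.readline().rstrip("\n")
            if not name:
                break  # end of file
            sequence = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, ignored
            quality = handle.readline().rstrip("\n")
            yield name[1:], sequence, quality  # drop the leading '@'
```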

Output Files

  • FASTQ: Deduplicated reads
  • BAM: Annotated alignment files with MCC-specific tags
  • JSON: Ligation statistics and metadata

BAM Tags Added by MCCNado

  • VP: Viewpoint name
  • OC: Oligo coordinates
  • RT: Reporter tag (0 for capture reads, 1 for reporter reads)
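To show how these tags might be read downstream, here is a sketch that parses the optional fields of one SAM-formatted record (for example from `samtools view`). The tag types (`Z` for VP and OC, `i` for RT) and the example values are assumptions; check the actual output of your MCCNado version.

```python
def parse_optional_fields(sam_line):
    """Return {tag: value} for the optional fields of one SAM record line."""
    columns = sam_line.rstrip("\n").split("\t")
    tags = {}
    for field in columns[11:]:  # the first 11 columns are mandatory
        # Split at most twice so values containing ':' stay intact.
        tag, type_code, value = field.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return tags


# Hypothetical record; tag types and values are assumptions.
record = "\t".join([
    "read1", "0", "chr1", "100", "60", "4M", "*", "0", "0", "ACGT", "IIII",
    "VP:Z:viewpoint_a", "OC:Z:chr1:50-150", "RT:i:1",
])
tags = parse_optional_fields(record)
print(tags["VP"], tags["OC"], tags["RT"])
```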

Performance

MCCNado is optimized for large-scale data processing:

  • Memory Efficient: Streaming processing for large files
  • Parallel Processing: Multi-threaded operations where applicable
  • Fast Hashing: Uses xxHash for rapid duplicate detection
  • Batch Processing: Configurable batch sizes for optimal performance
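The combination of streaming and hash-based duplicate detection can be sketched as follows. This is not MCCNado's Rust code: BLAKE2b from the standard library stands in for xxHash, and keying duplicates on the sequence alone is an assumption.

```python
import hashlib


def deduplicate(records):
    """Yield records whose sequence has not been seen before.

    `records` is an iterable of (name, sequence, quality) tuples. Only an
    8-byte digest per unique sequence is kept in memory, so the input can
    be streamed. BLAKE2b stands in for xxHash to stay within the stdlib.
    """
    seen = set()
    for name, sequence, quality in records:
        digest = hashlib.blake2b(sequence.encode(), digest_size=8).digest()
        if digest in seen:
            continue  # duplicate sequence, skip it
        seen.add(digest)
        yield name, sequence, quality


# Hypothetical reads: r2 repeats the sequence of r1.
reads = [("r1", "ACGT", "IIII"), ("r2", "ACGT", "IIII"), ("r3", "TTGA", "IIII")]
unique = list(deduplicate(reads))
print(f"{len(reads)} reads in, {len(unique)} unique")
```

Storing short digests rather than full sequences keeps memory use roughly proportional to the number of unique reads, at the cost of a very small chance of a hash collision.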

Architecture

The package consists of several core modules:

Development

Building from Source

# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone and build
git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
cargo build --release

# Install Python package
pip install -e .

Running Tests

# Rust tests
cargo test

# Python tests
python -m pytest tests/

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Citation

If you use MCCNado in your research, please cite:

[Your Citation Here]

Support

For questions, issues, or feature requests, please:

  1. Check the documentation
  2. Search existing issues
  3. Open a new issue if needed

Acknowledgments

  • Built with PyO3 for Python-Rust interoperability
  • Uses noodles for bioinformatics file format handling
  • Powered by tokio for async operations
