MCCNado: Rust-based tools for processing Micro-Capture-C data using SeqNado
MCCNado
A high-performance Rust library with Python bindings for processing Micro-Capture-C (MCC) sequencing data.
Overview
MCCNado is a bioinformatics tool designed for analyzing chromatin conformation capture sequencing data. It provides efficient implementations for common preprocessing tasks including FASTQ deduplication, viewpoint read splitting, BAM annotation, and ligation junction analysis.
Features
- FASTQ Deduplication: Remove duplicate reads from single-end and paired-end FASTQ files
- Viewpoint Read Splitting: Split reads containing viewpoint sequences into constituent segments
- BAM Annotation: Add metadata tags to BAM files for downstream analysis
- Ligation Junction Identification: Extract and analyze chromatin interaction data
- Ligation Statistics: Generate comprehensive statistics on cis/trans interactions
- High Performance: Implemented in Rust with optional async processing for large datasets
Installation
From PyPI (recommended)
pip install mccnado
From Source
git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
pip install .
Development Installation
git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
pip install -e .
Requirements
- Python 3.7+
- Rust (for building from source)
- samtools (for BAM file processing)
Usage
Python API
import mccnado

# Deduplicate FASTQ files
stats = mccnado.deduplicate_fastq(
    fastq1="input_R1.fastq.gz",
    output1="output_R1.fastq.gz",
    fastq2="input_R2.fastq.gz",    # Optional for paired-end
    output2="output_R2.fastq.gz",  # Optional for paired-end
)
print(f"Total reads: {stats['total_reads']}")
print(f"Unique reads: {stats['unique_reads']}")
print(f"Duplicate reads: {stats['duplicate_reads']}")

# Split viewpoint reads
mccnado.split_viewpoint_reads(
    bam="aligned_reads.bam",
    output="split_reads.fastq.gz",
)

# Annotate BAM file with MCC metadata
mccnado.annotate_bam(
    bam="input.bam",
    output_directory="annotated_output/",
)

# Extract ligation statistics
mccnado.extract_ligation_stats(
    bam="annotated.bam",
    stats="ligation_stats.json",
)
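As a small follow-on to the deduplication call, the returned statistics can be turned into a duplication rate. This is a minimal sketch that assumes only the three keys printed in the example (`total_reads`, `unique_reads`, `duplicate_reads`); the helper name `summarize_dedup` and the sample numbers are hypothetical and are not part of MCCNado.

```python
def summarize_dedup(stats):
    """Return duplication metrics from a deduplication stats mapping.

    Assumes the mapping exposes the three keys used in the README example.
    """
    total = stats["total_reads"]
    unique = stats["unique_reads"]
    duplicates = stats["duplicate_reads"]
    # Guard against an empty input file to avoid dividing by zero.
    rate = duplicates / total if total else 0.0
    return {
        "total": total,
        "unique": unique,
        "duplicates": duplicates,
        "duplication_rate": rate,
    }


# Stand-in values for illustration only; not real MCCNado output.
example = {"total_reads": 1000, "unique_reads": 750, "duplicate_reads": 250}
summary = summarize_dedup(example)
print(f"Duplication rate: {summary['duplication_rate']:.1%}")  # Duplication rate: 25.0%
```

In practice the `stats` object returned by `mccnado.deduplicate_fastq` would be passed in place of `example`.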
Command Line Interface
The package also provides a command-line interface through the Python module:
# Deduplicate FASTQ files
python -m mccnado.cli deduplicate input_R1.fastq.gz output_R1.fastq.gz
# Split viewpoint reads
python -m mccnado.cli split-reads aligned_reads.bam split_reads.fastq.gz
# Annotate BAM files
python -m mccnado.cli annotate input.bam output_directory/
# Extract ligation statistics
python -m mccnado.cli ligation-stats annotated.bam stats.json
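When these commands are driven from a Python script, it can help to build the argument list in one place. The sketch below uses only the four subcommand names shown above; the helper `build_command` is hypothetical, and it assumes every subcommand takes exactly one input path and one output path, as in the examples.

```python
import sys

# Subcommand names as they appear in the CLI examples above.
SUBCOMMANDS = ("deduplicate", "split-reads", "annotate", "ligation-stats")


def build_command(subcommand, input_path, output_path):
    """Build the argv list for one `python -m mccnado.cli` invocation."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return [
        sys.executable, "-m", "mccnado.cli",
        subcommand, str(input_path), str(output_path),
    ]


cmd = build_command("annotate", "input.bam", "output_directory/")
# Skip the interpreter path, which differs between machines.
print(" ".join(cmd[1:]))  # -m mccnado.cli annotate input.bam output_directory/

# To execute the command (requires mccnado to be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Using `sys.executable` rather than a bare `python` keeps the call inside the same environment in which the package was installed.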
File Formats
Input Files
- FASTQ: Raw sequencing reads (single-end or paired-end, gzipped or uncompressed)
- BAM: Aligned reads with proper headers and indexing
Output Files
- FASTQ: Deduplicated reads
- BAM: Annotated alignment files with MCC-specific tags
- JSON: Ligation statistics and metadata
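To illustrate what a cis/trans summary could look like, the sketch below counts junctions whose two ends fall on the same chromosome (cis) or on different chromosomes (trans) and serialises the result as JSON. The input representation (pairs of chromosome names), the function `count_cis_trans`, and the output keys are all assumptions made for illustration; the actual layout of MCCNado's JSON output is not documented here.

```python
import json
from collections import Counter


def count_cis_trans(junctions):
    """Count cis (same chromosome) and trans (different chromosome) junctions.

    `junctions` is an iterable of (chromosome_a, chromosome_b) pairs.
    """
    counts = Counter()
    for chrom_a, chrom_b in junctions:
        counts["cis" if chrom_a == chrom_b else "trans"] += 1
    total = counts["cis"] + counts["trans"]
    return {
        "cis": counts["cis"],
        "trans": counts["trans"],
        # Avoid dividing by zero when no junctions were found.
        "cis_fraction": counts["cis"] / total if total else 0.0,
    }


# Made-up junctions for illustration only.
junctions = [("chr1", "chr1"), ("chr1", "chr2"), ("chr3", "chr3"), ("chr1", "chr1")]
print(json.dumps(count_cis_trans(junctions), indent=2))
```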
BAM Tags Added by MCCNado
- VP: Viewpoint name
- OC: Oligo coordinates
- RT: Reporter tag (0 for capture reads, 1 for reporter reads)
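For downstream scripts, these tags can be read from the text (SAM) form of a record, for example the output of `samtools view`. The sketch below parses the optional `TAG:TYPE:VALUE` fields that follow the 11 mandatory SAM columns. The tag types (`Z` for VP and OC, `i` for RT) and the example values are assumptions for illustration; check real MCCNado output for the actual encoding.

```python
def parse_optional_fields(sam_line):
    """Parse the optional TAG:TYPE:VALUE fields of one SAM record into a dict."""
    fields = sam_line.rstrip("\n").split("\t")
    tags = {}
    # The first 11 columns of a SAM record are mandatory; tags follow them.
    for field in fields[11:]:
        # Split at most twice so values that contain ':' stay intact.
        tag, type_code, value = field.split(":", 2)
        tags[tag] = int(value) if type_code == "i" else value
    return tags


# Hypothetical record; the tag values are invented for this example.
record = "\t".join([
    "read1", "0", "chr1", "100", "60", "10M", "*", "0", "0",
    "ACGTACGTAC", "IIIIIIIIII",
    "VP:Z:geneA", "OC:Z:chr1-90-150", "RT:i:1",
])
tags = parse_optional_fields(record)
is_reporter = tags.get("RT") == 1
print(tags["VP"], tags["OC"], is_reporter)  # geneA chr1-90-150 True
```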
Performance
MCCNado is optimized for large-scale data processing:
- Memory Efficient: Streaming processing for large files
- Parallel Processing: Multi-threaded operations where applicable
- Fast Hashing: Uses xxHash for rapid duplicate detection
- Batch Processing: Configurable batch sizes for optimal performance
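The combination of streaming and hash-based duplicate detection can be sketched in a few lines of Python. This is not MCCNado's Rust implementation: it substitutes `hashlib.blake2b` from the standard library for xxHash, and it assumes that two reads are duplicates when their sequences are identical, which may differ from the rule MCCNado applies. The function names are hypothetical.

```python
import hashlib
import io


def iter_fastq(handle):
    """Yield (header, sequence, separator, quality) tuples from a FASTQ text stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # End of file.
        sequence = handle.readline()
        separator = handle.readline()
        quality = handle.readline()
        yield tuple(line.rstrip("\n") for line in (header, sequence, separator, quality))


def deduplicate(records):
    """Yield only the first record seen for each distinct sequence."""
    seen = set()
    for record in records:
        # Keep a short digest instead of the full sequence to limit memory use.
        # An 8-byte digest has a small but non-zero collision probability.
        digest = hashlib.blake2b(record[1].encode("ascii"), digest_size=8).digest()
        if digest not in seen:
            seen.add(digest)
            yield record


fastq_text = "@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIII\n@r3\nTTTT\n+\nIIII\n"
unique = list(deduplicate(iter_fastq(io.StringIO(fastq_text))))
print([record[0] for record in unique])  # ['@r1', '@r3']
```

Because both functions are generators, records are read and filtered one at a time; only the set of digests grows with the input.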
Architecture
The package consists of several core modules:
- fastq_deduplicate: FASTQ deduplication logic
- viewpoint_read_splitter: Read segmentation functionality
- mcc_data_handler: BAM annotation and processing
- ligation_stats: Statistical analysis of ligation events
- utils: Common utilities and data structures
Development
Building from Source
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Clone and build
git clone https://github.com/yourusername/MCCNado.git
cd MCCNado
cargo build --release
# Install Python package
pip install -e .
Running Tests
# Rust tests
cargo test
# Python tests
python -m pytest tests/
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Citation
If you use MCCNado in your research, please cite:
[Your Citation Here]
Support
For questions, issues, or feature requests, please:
- Check the documentation
- Search existing issues
- Open a new issue if needed
Acknowledgments