Genomic language model mitigates chimera artifacts in nanopore direct RNA sequencing

🧬 DeepChopper leverages a language model to accurately detect and chop artificial sequences that may cause chimeric reads, ensuring higher quality and more reliable sequencing results. By integrating seamlessly with existing workflows, DeepChopper provides a robust solution for researchers and bioinformaticians working with Nanopore direct-RNA sequencing data.

✨ What's New in v1.3.0

  • 🚀 Direct FASTQ Processing: No more encoding step! DeepChopper now works directly with FASTQ files
  • ⚡ Simplified Workflow: Go from raw data to results in just 2 commands (predict → chop)
  • 📦 Auto-format Detection: Automatically handles .fastq, .fq, .fastq.gz, and .fq.gz files
  • ⚠️ Breaking Change: The encode command has been removed - update your pipelines accordingly

See full changelog →
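
The extension-based handling listed above can be illustrated with a small reader. This is a minimal sketch, not DeepChopper's internal implementation: the names `open_fastq` and `read_fastq` are hypothetical, and it assumes well-formed four-line FASTQ records.

```python
import gzip
from pathlib import Path
from typing import Iterator, TextIO, Tuple

# The four extensions named in the release notes.
FASTQ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz")


def open_fastq(path: str) -> TextIO:
    """Open a FASTQ file as text, decompressing when the name ends in .gz."""
    name = Path(path).name.lower()
    if not name.endswith(FASTQ_SUFFIXES):
        raise ValueError(f"not a recognized FASTQ extension: {path}")
    if name.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fastq(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) for each four-line record."""
    with open_fastq(path) as handle:
        while True:
            header = handle.readline().rstrip("\r\n")
            if not header:
                break  # end of file
            sequence = handle.readline().rstrip("\r\n")
            handle.readline()  # skip the '+' separator line
            quality = handle.readline().rstrip("\r\n")
            # Drop the leading '@' and any description after the read ID.
            yield header[1:].split(maxsplit=1)[0], sequence, quality
```

A reader like this is handy for spot-checking read counts and lengths before and after chopping, whichever compression the files use.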

📘 FEATURED: We provide a comprehensive tutorial that includes an example dataset in our full documentation.

🚀 Quick Start: Try DeepChopper Online

Experience DeepChopper instantly through our user-friendly web interface. No installation required! Simply click the button below to launch the web application and start exploring DeepChopper's capabilities:

Open in Hugging Face Spaces

What you can do online:

  • 📤 Upload your sequencing data
  • 🔬 Run DeepChopper's analysis
  • 📊 Visualize results
  • 🎛️ Experiment with different parameters

Perfect for quick tests or demonstrations! However, for extensive analyses or custom workflows, we recommend installing DeepChopper locally.

⚠️ Note: The online version is limited to one FASTQ record at a time and may not be suitable for large-scale projects.

📦 Installation

DeepChopper can be installed using pip, the Python package installer. Follow these steps to install:

  1. Ensure you have Python 3.10 or later installed on your system.

  2. Create a virtual environment (recommended):

    python -m venv deepchopper_env
    source deepchopper_env/bin/activate  # On Windows use `deepchopper_env\Scripts\activate`
    
  3. Install DeepChopper:

    pip install deepchopper
    
  4. Verify the installation:

    deepchopper --help
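
If you want to script the checks from steps 1 and 4, something like the following may help. It is a rough sketch using only the Python standard library; the function name `check_environment` is hypothetical, and it assumes the distribution and the console command are both named `deepchopper`, as in the commands above.

```python
import shutil
import sys
from importlib import metadata


def check_environment(min_python=(3, 10), package="deepchopper"):
    """Report whether the interpreter and the installed package look usable."""
    try:
        installed_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        installed_version = None  # package is not installed in this environment
    return {
        "python_ok": sys.version_info[:2] >= tuple(min_python),
        "package_version": installed_version,
        # Assumes the console script shares the package name.
        "cli_on_path": shutil.which(package) is not None,
    }


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

Run it inside the activated virtual environment; a `package_version` of `None` or a `cli_on_path` of `False` suggests the install went into a different environment.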
    

Compatibility and Support

DeepChopper is designed to work across multiple platforms and Python versions. The compatibility matrix for PyPI installations covers the following:

PyPI Support

  • Python versions: 3.10, 3.11, 3.12
  • Platforms: Linux x86_64, macOS Intel, macOS Apple Silicon, Windows x86_64

🆘 Trouble installing? Check our Troubleshooting Guide or open an issue.

🛠️ Usage

For a comprehensive guide, check out our full tutorial. Here's a quick overview:

Command-Line Interface

🎉 New in v1.3.0: DeepChopper now works directly with FASTQ files! No encoding step required.

DeepChopper offers two main commands: predict and chop.

  1. Predict chimera artifacts directly from FASTQ:

    deepchopper predict input.fastq --output predictions
    

    Using GPUs? Add the --gpus flag:

    deepchopper predict input.fastq --output predictions --gpus 2
    

    Supports all FASTQ formats: .fastq, .fq, .fastq.gz, .fq.gz

  2. Chop chimera artifacts:

    deepchopper chop predictions/0 input.fastq
    

Want a GUI? Launch the web interface (note: limited to one FASTQ record at a time):

deepchopper web

Python Library

Integrate DeepChopper into your Python scripts:

import deepchopper

model = deepchopper.DeepChopper.from_pretrained("yangliz5/deepchopper")
# Your analysis code here
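
For batch jobs, the predict and chop commands from the Command-Line Interface section can also be driven from a Python script. The sketch below is one possible wrapper, not part of the DeepChopper API: the helper names are hypothetical, it uses only the flags shown in this README, and the `0` subdirectory is copied from the chop example above without assuming what the index means.

```python
import subprocess
from typing import List, Optional


def build_predict_cmd(fastq: str, output: str = "predictions",
                      gpus: Optional[int] = None) -> List[str]:
    """Build the argv for `deepchopper predict`, adding --gpus when requested."""
    cmd = ["deepchopper", "predict", fastq, "--output", output]
    if gpus is not None:
        cmd += ["--gpus", str(gpus)]
    return cmd


def build_chop_cmd(fastq: str, output: str = "predictions") -> List[str]:
    """Build the argv for `deepchopper chop`.

    The "/0" suffix mirrors the README example (predictions/0); treat it as
    an assumption and adjust it if your output directory is laid out differently.
    """
    return ["deepchopper", "chop", f"{output}/0", fastq]


def run_pipeline(fastq: str, output: str = "predictions",
                 gpus: Optional[int] = None) -> None:
    """Run predict, then chop, stopping at the first failing command."""
    for cmd in (build_predict_cmd(fastq, output, gpus),
                build_chop_cmd(fastq, output)):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("input.fastq", gpus=2)` would issue the same two commands as the CLI walkthrough. Keeping the command builders separate from the runner makes the argument lists easy to inspect or log before anything is executed.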

📚 Cite

If DeepChopper aids your research, please cite our paper:

@article{li2026genomic,
  title = {Genomic Language Model Mitigates Chimera Artifacts in Nanopore Direct {{RNA}} Sequencing},
  author = {Li, Yangyang and Wang, Ting-You and Guo, Qingxiang and Ren, Yanan and Lu, Xiaotong and Cao, Qi and Yang, Rendong},
  date = {2026-01-19},
  journaltitle = {Nature Communications},
  shortjournal = {Nat Commun},
  publisher = {Nature Publishing Group},
  issn = {2041-1723},
  doi = {10.1038/s41467-026-68571-5},
  url = {https://www.nature.com/articles/s41467-026-68571-5},
  urldate = {2026-01-20}
}

🤝 Contribution

We welcome contributions! Here's how to set up your development environment:

Build Environment

Install UV and Rust first, then clone the repository and set up the environment:

git clone https://github.com/ylab-hi/DeepChopper.git
cd DeepChopper

# Install dependencies
uv sync

# Run DeepChopper
uv run deepchopper --help

🎉 Ready to contribute? Check out our Contribution Guidelines to get started!

🔗 Related Projects

  • ChimeraLM - Identify artificial chimeric reads from whole genome amplification (WGA) processes

📬 Support

Need help? Have questions?


DeepChopper is developed with ❤️ by the YLab team. Happy sequencing! 🧬🔬
