A Genomic Language Model for Chimera Artifact Detection in Nanopore Direct RNA Sequencing
DeepChopper
DeepChopper uses a language model to detect and chop artificial sequences that can cause chimeric reads, improving the quality and reliability of sequencing results. It integrates with existing workflows and is intended for researchers and bioinformaticians working with Nanopore direct RNA sequencing data.
Quick Start: Try DeepChopper Online
Experience DeepChopper instantly through our user-friendly web interface. No installation required!
Simply click the button below to launch the web application and start exploring DeepChopper's capabilities:
This online version provides a convenient way to:
- Upload your sequencing data
- Run DeepChopper's analysis
- Visualize results
- Experiment with different parameters
It's well suited to quick tests or to showcasing DeepChopper's functionality without a local setup. Because the online version is limited to one FASTQ record at a time, it is not suitable for large-scale projects. For more extensive analyses or custom workflows, we recommend installing DeepChopper on your machine.
Install
DeepChopper can be installed using pip, the Python package installer. Follow these steps to install:
- Ensure you have Python 3.10 or later installed on your system.
- It's recommended to create a virtual environment:
python -m venv deepchopper_env
source deepchopper_env/bin/activate # On Windows use `deepchopper_env\Scripts\activate`
- Install DeepChopper:
pip install deepchopper
- Verify the installation:
deepchopper --help
Note: If you encounter any issues, please check our GitHub repository for troubleshooting guides or to report a problem.
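Before installing, it can help to confirm that the interpreter meets the Python 3.10 requirement stated above. The following is a minimal sketch of such a check; the helper name `python_is_supported` is hypothetical and is not part of DeepChopper.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the install steps


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True when the interpreter meets the stated minimum version.

    Hypothetical helper for illustration; not part of DeepChopper.
    """
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    found = sys.version.split()[0]
    if python_is_supported():
        print("Python version OK:", found)
    else:
        print("Python 3.10 or later is required; found", found)
```

Run it with the same interpreter that will be used to create the virtual environment.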
Usage
We provide a complete guide on how to use DeepChopper for Nanopore direct RNA sequencing data. Below is a brief overview of the command-line interface and library usage.
Command-Line Interface
DeepChopper provides a command-line interface (CLI) for easy access to its features. In total, there are three commands: encode, predict, and chop.
DeepChopper can be used to encode, predict, and chop chimeric reads in direct-RNA sequencing data.
First, encode the input data using the encode command, which generates a .parquet file.
deepchopper encode <input.fq>
Next, use the predict command to predict chimeric reads in the encoded data.
deepchopper predict <input.parquet> --output-path predictions
If you have GPUs, you can use the --gpus flag to specify the GPU device.
deepchopper predict <input.parquet> --output-path predictions --gpus 2
Finally, use the chop command to chop the chimeric reads in the input data.
deepchopper chop <predictions> raw.fq
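The three steps above can be chained from a script. The sketch below only assembles and prints the commands by default, and runs them when `dry_run=False`. Several details are assumptions: the helper names are hypothetical, the name of the .parquet file written by encode is passed in explicitly because the text above does not state it, and the `--output-path` spelling should be confirmed with `deepchopper predict --help`.

```python
import shlex
import subprocess


def build_pipeline(fastq, parquet, predictions="predictions", gpus=None):
    """Return the encode, predict, and chop invocations as argv lists.

    `parquet` is the file produced by the encode step; its name is an
    assumption supplied by the caller, not something this sketch derives.
    """
    predict = ["deepchopper", "predict", parquet, "--output-path", predictions]
    if gpus is not None:
        predict += ["--gpus", str(gpus)]
    return [
        ["deepchopper", "encode", fastq],
        predict,
        ["deepchopper", "chop", predictions, fastq],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it as well unless dry_run is True."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_pipeline(build_pipeline("raw.fq", "raw.parquet"))
```

Keeping the dry run as the default makes it easy to inspect the exact commands before committing to a long prediction job.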
DeepChopper also provides a web-based user interface. Note that the web application can only take one FASTQ record at a time.
deepchopper web
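Since the web interface accepts a single FASTQ record, it can be handy to pull one record out of a larger file first. This is a minimal sketch that assumes the common four-line FASTQ layout (header, sequence, `+`, qualities) with no wrapped sequence lines; the function name `first_fastq_record` is hypothetical and not part of DeepChopper.

```python
from itertools import islice


def first_fastq_record(path):
    """Return the first record of a FASTQ file as a single string.

    Assumes four lines per record: '@' header, sequence, '+' separator,
    and quality string. Multi-line (wrapped) FASTQ is not handled.
    """
    with open(path, "r", encoding="utf-8") as handle:
        lines = list(islice(handle, 4))  # read only the first four lines
    if len(lines) < 4 or not lines[0].startswith("@") or not lines[2].startswith("+"):
        raise ValueError(f"{path} does not start with a four-line FASTQ record")
    return "".join(lines)
```

The returned text can then be pasted into the web application.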
Library
import deepchopper
model = deepchopper.DeepChopper.from_pretrained("yangliz5/deepchopper")
Cite
If you use DeepChopper in your research, please cite the following paper:
🤜 Contribution
Build Environment
git clone https://github.com/ylab-hi/DeepChopper.git
cd DeepChopper
conda env create -f environment.yaml
conda activate deepchopper
Install Dependencies
pip install pipx
pipx install --suffix @master git+https://github.com/python-poetry/poetry.git@master
poetry@master install
File details
Details for the file deepchopper-1.1.0.tar.gz.
File metadata
- Download URL: deepchopper-1.1.0.tar.gz
- Upload date:
- Size: 75.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | ba39bfe9a0a3896941c46204d88f95b1166b6f048d489d6a0c355bb76184786d
MD5 | 9e6b9447e44a9b952255fea763d144ae
BLAKE2b-256 | 819cb328d95c1caa4595c21d15db9749728b767d792b4051ff6a61d74c3ded00
File details
Details for the file deepchopper-1.1.0-cp310-abi3-win_amd64.whl.
File metadata
- Download URL: deepchopper-1.1.0-cp310-abi3-win_amd64.whl
- Upload date:
- Size: 4.4 MB
- Tags: CPython 3.10+, Windows x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 54817c93185b95506f05e11be37ae65682a8f9e70c0e8f43910a0fa87834b7a2
MD5 | 4f6576587b0752e960ecbf6d7534ad24
BLAKE2b-256 | 992b4d48ec18de2c42e156ba63f5b96b9462ba590fcc70866fccf2068c213ea8
File details
Details for the file deepchopper-1.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.
File metadata
- Download URL: deepchopper-1.1.0-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
- Upload date:
- Size: 5.2 MB
- Tags: CPython 3.10+, manylinux: glibc 2.17+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 019cd81c1d660502559f305e143c324d1a1ab452c3149e856e53d0b9adad96e2
MD5 | 995e789e92fe9b8a3178f7c885ad0dcf
BLAKE2b-256 | d210a9909de527d43736ffedd9c96bc95d4b8d4acf3d6fd2e3208156f474b67c
File details
Details for the file deepchopper-1.1.0-cp310-abi3-macosx_11_0_arm64.whl.
File metadata
- Download URL: deepchopper-1.1.0-cp310-abi3-macosx_11_0_arm64.whl
- Upload date:
- Size: 4.3 MB
- Tags: CPython 3.10+, macOS 11.0+ ARM64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0e98a5aa1c23d73f46bbed6361d06cd962b330a37867d563c16a884f114a605b
MD5 | 5e82b233b977623673de4dfc78e394bd
BLAKE2b-256 | e7c0a15f6640470d88ef716efd4813ad43678d0bd0c1d1dfa179320a4c850763
File details
Details for the file deepchopper-1.1.0-cp310-abi3-macosx_10_12_x86_64.whl.
File metadata
- Download URL: deepchopper-1.1.0-cp310-abi3-macosx_10_12_x86_64.whl
- Upload date:
- Size: 4.7 MB
- Tags: CPython 3.10+, macOS 10.12+ x86-64
- Uploaded using Trusted Publishing? No
- Uploaded via: maturin/1.7.4
File hashes
Algorithm | Hash digest
---|---
SHA256 | de843340086d6cf10472ac3a1f0f36bfa7933daa602626bb92d289b482c6a9ce
MD5 | 7f1bfb931a7137b2670a499d98637f79
BLAKE2b-256 | ace97f140421e227d7600e7aaa2063124fea5c7e38a59a48e45b6a9e7d5ba932
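A downloaded file can be checked against the SHA256 digests listed above. The sketch below uses only Python's standard `hashlib`; the function names and the local file path in the usage note are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected_hex):
    """Return True when the file's SHA256 equals the published digest."""
    return sha256_of(path) == expected_hex.strip().lower()
```

For example, `matches_published_digest("deepchopper-1.1.0.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution, assuming the file sits in the current directory.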