A Genomic Language Model for Chimera Artifact Detection in Nanopore Direct RNA Sequencing
DeepChopper
DeepChopper leverages a language model to accurately detect and chop artificial sequences that may cause chimeric reads, yielding higher-quality and more reliable sequencing results. It integrates with existing workflows and is aimed at researchers and bioinformaticians working with Nanopore direct-RNA sequencing data.
Install
DeepChopper can be installed using pip, the Python package installer. Follow these steps to install:
- Ensure you have Python 3.10 or later installed on your system.
- It's recommended to create a virtual environment:
  python -m venv deepchopper_env
  source deepchopper_env/bin/activate # On Windows use `deepchopper_env\Scripts\activate`
- Install DeepChopper:
  pip install deepchopper
- Verify the installation:
  deepchopper --help
- DeepChopper includes a Rust command-line tool for faster performance. Install it with Cargo:
  cargo install deepchopper-chop
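The prerequisites in the steps above can be checked from Python before or after installing. This is a minimal sketch using only the standard library; the helper names `python_is_supported` and `cli_on_path` are illustrative and are not part of DeepChopper.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the install steps


def python_is_supported(version=None):
    """Return True if a (major, minor, ...) version tuple meets MIN_PYTHON."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def cli_on_path(name="deepchopper"):
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("deepchopper CLI found:", cli_on_path())
```

If the CLI is not found after `pip install deepchopper`, the usual cause is that the virtual environment is not activated in the current shell.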
For GPU support, ensure you have CUDA installed on your system, then install the GPU version:
pip install deepchopper[gpu]
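Before requesting GPUs, it can help to confirm that a CUDA device is actually visible. The sketch below assumes that the GPU build relies on PyTorch, which this page does not state; if `torch` is not installed, the function reports that instead of failing. The function name `cuda_status` is illustrative.

```python
import importlib.util


def cuda_status():
    """Report whether PyTorch is installed and whether it can see a CUDA device."""
    # ASSUMPTION: the GPU code path uses PyTorch; adjust if it does not.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False, "cuda_available": False, "device_count": 0}

    import torch

    available = bool(torch.cuda.is_available())
    count = torch.cuda.device_count() if available else 0
    return {"torch_installed": True, "cuda_available": available, "device_count": count}


if __name__ == "__main__":
    print(cuda_status())
```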
Note: If you encounter any issues, please check our GitHub repository for troubleshooting guides or to report a problem.
Usage
We provide a complete guide on how to use DeepChopper for Nanopore direct-RNA sequencing data. Below is a brief overview of the command-line interface and library usage.
Command-Line Interface
DeepChopper provides a command-line interface (CLI) for easy access to its features. In total, there are three commands: `encode`, `predict`, and `chop`.
DeepChopper can be used to encode, predict, and chop chimeric reads in direct-RNA sequencing data.
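To make "chopping" concrete, here is a toy sketch of the idea: given the positions of predicted artificial sequence within a read, the read is split into the segments on either side. This is an illustration only and not DeepChopper's implementation; the function `chop`, its half-open interval convention, and the `min_length` parameter are assumptions made for the example.

```python
def chop(sequence, artifact_intervals, min_length=1):
    """Split `sequence` around half-open [start, end) artifact intervals.

    Returns the segments that remain after the artifact regions are removed,
    dropping any segment shorter than `min_length`.
    """
    segments = []
    cursor = 0  # start of the next segment to keep
    for start, end in sorted(artifact_intervals):
        if start > cursor:
            segments.append(sequence[cursor:start])
        # max() keeps the cursor moving forward when intervals overlap
        cursor = max(cursor, end)
    if cursor < len(sequence):
        segments.append(sequence[cursor:])
    return [segment for segment in segments if len(segment) >= min_length]


if __name__ == "__main__":
    # Two transcripts joined by a made-up 6-base artifact at positions 8..13
    read = "ACGTACGT" + "NNNNNN" + "TTGGCCAA"
    print(chop(read, [(8, 14)]))
```

In a real FASTQ record the quality string would have to be split at the same positions so that each resulting segment keeps its per-base qualities.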
First, encode the input data using the `encode` command, which generates a `.parquet` file.
deepchopper encode <input.fq>
Next, use the `predict` command to predict chimeric reads in the encoded data.
deepchopper predict <input.parquet> --output-path predictions
If you have GPUs, use the `--gpus` flag to specify how many to use.
deepchopper predict <input.parquet> --output-path predictions --gpus 2
Finally, use the `chop` command to chop the chimeric reads in the input data.
deepchopper chop <predictions> raw.fq
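The three commands can be chained from a script. The sketch below only assembles and runs the command lines shown in this section; it makes two assumptions that should be checked against `deepchopper --help`: that `encode` writes its `.parquet` file next to the input with the same stem, and that the output option of `predict` is spelled as passed in `output_flag`. The helper names `build_commands` and `run_pipeline` are illustrative.

```python
import subprocess
from pathlib import Path


def build_commands(fastq, predictions="predictions", gpus=0,
                   output_flag="--output-path"):
    """Return the encode, predict, and chop command lines as argument lists."""
    fastq = Path(fastq)
    # ASSUMPTION: encode writes <stem>.parquet beside the input file.
    parquet = fastq.with_suffix(".parquet")

    predict = ["deepchopper", "predict", str(parquet), output_flag, predictions]
    if gpus:
        predict += ["--gpus", str(gpus)]

    return [
        ["deepchopper", "encode", str(fastq)],
        predict,
        ["deepchopper", "chop", predictions, str(fastq)],
    ]


def run_pipeline(fastq, **kwargs):
    """Run the three steps in order, stopping at the first failure."""
    for command in build_commands(fastq, **kwargs):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    for command in build_commands("raw.fq", gpus=2):
        print(" ".join(command))
```

Building the commands separately from running them makes it easy to print and inspect them first, which is useful when the flag names have not yet been verified.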
DeepChopper also provides a web-based user interface. Note that the web application accepts only one FASTQ record at a time.
deepchopper web
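Because the web interface takes a single FASTQ record, it can be convenient to pull one record out of a larger file first. The sketch below assumes the common four-line FASTQ layout (header, sequence, `+` line, quality) with no line wrapping; `first_fastq_record` is an illustrative helper, not part of DeepChopper.

```python
def first_fastq_record(text):
    """Return the first four-line FASTQ record in `text`, after basic checks."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) < 4:
        raise ValueError("expected at least four non-empty lines")

    header, sequence, separator, quality = lines[:4]
    if not header.startswith("@"):
        raise ValueError("header line must start with '@'")
    if not separator.startswith("+"):
        raise ValueError("third line must start with '+'")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")

    return "\n".join([header, sequence, separator, quality]) + "\n"


if __name__ == "__main__":
    example = "@read1\nACGT\n+\nIIII\n@read2\nTTGG\n+\nHHHH\n"
    print(first_fastq_record(example), end="")
```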
Library
import deepchopper
model = deepchopper.DeepChopper.from_pretrained("yangliz5/deepchopper")
Cite
🤜 Contribution
Build Environment
git clone https://github.com/ylab-hi/DeepChopper.git
cd DeepChopper
conda env create -f environment.yaml
conda activate deepchopper
Install Dependencies
pip install pipx
pipx install --suffix @master git+https://github.com/python-poetry/poetry.git@master
poetry@master install