A Genomic Language Model for Chimera Artifact Detection in Nanopore Direct RNA Sequencing

DeepChopper leverages a language model to accurately detect and chop artificial sequences that may cause chimeric reads, ensuring higher-quality and more reliable sequencing results. It integrates with existing workflows, giving researchers and bioinformaticians who work with Nanopore direct RNA sequencing data a robust way to remove these artifacts.

Install

DeepChopper can be installed using pip, the Python package installer. Follow these steps to install:

  1. Ensure you have Python 3.10 or later installed on your system.

  2. It's recommended to create a virtual environment:

    python -m venv deepchopper_env
    source deepchopper_env/bin/activate  # On Windows use `deepchopper_env\Scripts\activate`
    
  3. Install DeepChopper:

    pip install deepchopper
    
  4. Verify the installation:

    deepchopper --help
    
  5. Optionally, install the Rust command-line tool that DeepChopper includes for faster performance:

    cargo install deepchopper-chop

For GPU support, ensure you have CUDA installed on your system, then install the GPU version:

pip install deepchopper[gpu]

Note: If you encounter any issues, please check our GitHub repository for troubleshooting guides or to report a problem.
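
The prerequisites above can also be checked from Python before running anything. This is a minimal sketch using only the standard library; `python_is_supported` and `installed_version` are hypothetical helper names, and the only facts taken from this page are the minimum Python version (3.10) and the package name `deepchopper`.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 10)  # minimum version stated in the install steps


def python_is_supported(version_info=sys.version_info) -> bool:
    """Return True if the interpreter meets the stated minimum version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(package: str = "deepchopper"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("deepchopper version:", installed_version() or "not installed")
```

Run it inside the activated virtual environment; a result of "not installed" usually means pip installed the package into a different interpreter.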

Usage

We provide a complete guide on how to use DeepChopper for Nanopore direct RNA sequencing data. Below is a brief overview of the command-line interface and library usage.

Command-Line Interface

DeepChopper provides a command-line interface (CLI) with three commands: encode, predict, and chop. Together they encode the reads, predict chimeric artifacts, and chop the affected reads in direct RNA sequencing data.

First, encode the input data with the encode command, which generates a .parquet file.

deepchopper encode <input.fq>

Next, we can use the predict command to predict chimeric reads in the encoded data.

deepchopper predict <input.parquet> --output-path predictions

If you have GPUs, you can use the --gpus flag to specify the GPU device.

deepchopper predict <input.parquet> --output-path predictions --gpus 2

Finally, we can use the chop command to chop the chimeric reads in the input data.

deepchopper chop <predictions> raw.fq
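
The three steps above can be chained from a script. The sketch below only assembles the commands shown in this section and prints them; it does not run DeepChopper. `build_pipeline` is a hypothetical helper, the `.parquet` file name is an assumption about where encode writes its output, and the `--output-path` spelling should be confirmed with `deepchopper predict --help`.

```python
import shlex
from pathlib import Path
from typing import List, Optional


def build_pipeline(
    fastq: str, predictions: str = "predictions", gpus: Optional[int] = None
) -> List[List[str]]:
    """Return the encode, predict, and chop commands as argument lists."""
    # Assumption: encode writes <input>.parquet next to the input FASTQ.
    parquet = str(Path(fastq).with_suffix(".parquet"))
    predict = ["deepchopper", "predict", parquet, "--output-path", predictions]
    if gpus is not None:
        # Same --gpus flag as in the GPU example above.
        predict += ["--gpus", str(gpus)]
    return [
        ["deepchopper", "encode", fastq],
        predict,
        ["deepchopper", "chop", predictions, fastq],
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for command in build_pipeline("raw.fq", gpus=2):
        print(shlex.join(command))
```

To execute the steps for real, pass each list to `subprocess.run(command, check=True)` so that a failing step stops the pipeline.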

DeepChopper also provides a web-based user interface. Note that the web application accepts only one FASTQ record at a time.

deepchopper web
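
Because the web interface takes a single FASTQ record, it can help to pull one read out of a larger file first. The following is a rough sketch that is not part of DeepChopper: `iter_fastq` and `get_record` are hypothetical helpers, and the code assumes the common four-line FASTQ layout (header, sequence, `+`, quality) with no wrapped sequence lines.

```python
import io
from typing import Iterator, TextIO, Tuple


def iter_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of file (or a trailing blank line)
        sequence = handle.readline().rstrip("\r\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        yield header, sequence, quality


def get_record(handle: TextIO, read_id: str) -> str:
    """Return the record whose ID equals read_id, formatted as FASTQ text."""
    for header, sequence, quality in iter_fastq(handle):
        # The read ID is the first whitespace-delimited token after '@'.
        if header[1:].split(maxsplit=1)[:1] == [read_id]:
            return f"{header}\n{sequence}\n+\n{quality}\n"
    raise KeyError(read_id)


if __name__ == "__main__":
    demo = "@read1 sample\nACGT\n+\nIIII\n@read2\nGGCC\n+\n!!!!\n"
    print(get_record(io.StringIO(demo), "read2"), end="")
```

With a real file, open it with `open("raw.fq")` and pass the handle to `get_record` together with the read ID of interest.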

Library

import deepchopper

model = deepchopper.DeepChopper.from_pretrained("yangliz5/deepchopper")

Cite

🤜 Contribution

Build Environment

git clone https://github.com/ylab-hi/DeepChopper.git
cd DeepChopper
conda env create -f environment.yaml
conda activate deepchopper

Install Dependencies

pip install pipx
pipx install --suffix @master git+https://github.com/python-poetry/poetry.git@master
poetry@master install

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • deepchopper-1.0.1.tar.gz (75.6 MB): Source

Built Distributions

  • deepchopper-1.0.1-cp310-abi3-win_amd64.whl (4.4 MB): CPython 3.10+, Windows x86-64
  • deepchopper-1.0.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (5.2 MB): CPython 3.10+, manylinux (glibc 2.17+) x86-64
  • deepchopper-1.0.1-cp310-abi3-macosx_11_0_arm64.whl (4.3 MB): CPython 3.10+, macOS 11.0+ ARM64
  • deepchopper-1.0.1-cp310-abi3-macosx_10_12_x86_64.whl (4.7 MB): CPython 3.10+, macOS 10.12+ x86-64
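
The wheel file names above encode the compatibility tags that pip uses to pick a file for your platform. As an illustration, the sketch below splits a file name into its parts following the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). `parse_wheel_filename` and `WheelTags` are hypothetical names written for this example, not part of DeepChopper or pip.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    tags = parse_wheel_filename("deepchopper-1.0.1-cp310-abi3-win_amd64.whl")
    print(tags)
```

Here `cp310` together with `abi3` indicates a wheel built against CPython's stable ABI starting at 3.10, which is why a single file per platform is listed as covering CPython 3.10+.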

File details

Details for the file deepchopper-1.0.1.tar.gz.

File metadata

  • Download URL: deepchopper-1.0.1.tar.gz
  • Upload date:
  • Size: 75.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: maturin/1.7.4

File hashes

Hashes for deepchopper-1.0.1.tar.gz
Algorithm Hash digest
SHA256 44c4018f77ac6d8a0914ed54540033bf8fed2548e7c00c5fac4e6a4980bceac3
MD5 3611ebcdedae5218de9f422a1e76d74e
BLAKE2b-256 84481465b5493c4bccbdcef5ef5ba0562e037453b078feff98c63aaba9206826
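
A downloaded file can be checked against the digests listed above with Python's standard `hashlib` module. This is a minimal sketch: `file_digests` and `matches_sha256` are hypothetical helper names, and BLAKE2b-256 is computed here as BLAKE2b with a 32-byte digest.

```python
import hashlib
from pathlib import Path


def file_digests(path, chunk_size: int = 1 << 20) -> dict:
    """Compute SHA256 and BLAKE2b-256 hex digests of a file, reading in chunks."""
    sha256 = hashlib.sha256()
    blake2b_256 = hashlib.blake2b(digest_size=32)  # 32 bytes = 256 bits
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha256.update(chunk)
            blake2b_256.update(chunk)
    return {"SHA256": sha256.hexdigest(), "BLAKE2b-256": blake2b_256.hexdigest()}


def matches_sha256(path, expected: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex string."""
    return file_digests(path)["SHA256"] == expected.strip().lower()
```

For example, `matches_sha256("deepchopper-1.0.1.tar.gz", "<SHA256 from the table>")` should return True for an intact download of the source distribution.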

File details

Details for the file deepchopper-1.0.1-cp310-abi3-win_amd64.whl.

File hashes

Hashes for deepchopper-1.0.1-cp310-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 91c96c27f3f507ca25bfdda18e2d54e5c4e3b6b26a5c15bb248bc46c41b0323e
MD5 1e0773cf314489fd7deb681169513a0b
BLAKE2b-256 e7ff115baf89a3e8a1e52e5aa892c0f71e108d7a2e83625050f68752533cae8c

File details

Details for the file deepchopper-1.0.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for deepchopper-1.0.1-cp310-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 0d9888a8bd8f1021a510ed82baa8b7097ae99fd0b2c06f9b0d62002a13987ebf
MD5 7bb3f91cbeab61150618ea37d5d11908
BLAKE2b-256 7b4e03d0dcebcaf600f117d4f54d13739371424ea9b09f3dcc7d4cf362fd3756

File details

Details for the file deepchopper-1.0.1-cp310-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for deepchopper-1.0.1-cp310-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 e90e0e7cc4df59d6b8d20653f9a27667388b0754841745495389946228136d19
MD5 dfdb808f329c4b8045fb438a2e134fc6
BLAKE2b-256 c8f370aceff76daaea43ee31a523c0ce45a751eb6e6c940ff336171f598bf350

File details

Details for the file deepchopper-1.0.1-cp310-abi3-macosx_10_12_x86_64.whl.

File hashes

Hashes for deepchopper-1.0.1-cp310-abi3-macosx_10_12_x86_64.whl
Algorithm Hash digest
SHA256 3a397a8d4fba27fe643ece7e787114a10633bafe2221aecb52d77a0cb3ceebbc
MD5 a921bf5d2da2578e4e847bcfc9198191
BLAKE2b-256 a6c095b500b702d94a83fafd6c306c6d59bab79f8687d6e7ccc74c3f47f1d7f7
