ChimeraLM: A genomic language model to identify chimeric artifacts introduced by whole genome amplification (WGA).

Project description

ChimeraLM

A Genomic Language Model for Detecting WGA Chimeric Artifacts


A deep learning-powered tool to identify chimeric artifacts introduced by whole genome amplification (WGA).

Installation

pip install chimeralm

Requirements: Python 3.10, 3.11, or 3.12

For GPU support, installation instructions, and troubleshooting, see the Installation Guide.

Quick Start

# Predict chimeric reads (CPU)
chimeralm predict your_data.bam

# Predict with GPU acceleration
chimeralm predict your_data.bam --gpus 1 --batch-size 24

# Filter BAM to remove chimeric reads
chimeralm filter your_data.bam your_data.predictions
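
For scripted runs, the two commands above can be assembled from Python. This is a minimal sketch, not part of ChimeraLM itself: the helper names (build_predict_cmd, build_filter_cmd, run_pipeline) are hypothetical, only the subcommands and flags shown in the Quick Start are used, and the predictions path is passed in by the caller on the assumption that it matches what the predict step produced.

```python
import shlex
import subprocess


def build_predict_cmd(bam, gpus=0, batch_size=None):
    """Return the `chimeralm predict` invocation as an argument list."""
    cmd = ["chimeralm", "predict", str(bam)]
    if gpus:
        # Only add --gpus for GPU runs; the CPU example passes no flag.
        cmd += ["--gpus", str(gpus)]
    if batch_size is not None:
        cmd += ["--batch-size", str(batch_size)]
    return cmd


def build_filter_cmd(bam, predictions):
    """Return the `chimeralm filter` invocation as an argument list."""
    return ["chimeralm", "filter", str(bam), str(predictions)]


def run_pipeline(bam, predictions, gpus=0, batch_size=None, dry_run=True):
    """Print both commands and, unless dry_run is set, execute them in order."""
    cmds = [
        build_predict_cmd(bam, gpus, batch_size),
        build_filter_cmd(bam, predictions),
    ]
    for cmd in cmds:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the pipeline if a step exits non-zero.
            subprocess.run(cmd, check=True)
    return cmds
```

Calling run_pipeline("your_data.bam", "your_data.predictions", gpus=1, batch_size=24) with the default dry_run=True only prints the commands, which is a cheap way to check them before a long GPU job.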

Output:

  • Predictions: Tab-separated file with read names and labels (0=biological, 1=chimeric)
  • Filtered BAM: {input}.filtered.sorted.bam with chimeric reads removed
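
The predictions file can be summarized with a few lines of Python. The sketch below assumes the read name is in the first column and the label in the second, with no other required columns; the exact layout is not specified here, so check it against your own output. The function name summarize_predictions is hypothetical.

```python
import csv
from collections import Counter

# Label meanings as described above: 0 = biological, 1 = chimeric.
LABELS = {"0": "biological", "1": "chimeric"}


def summarize_predictions(path):
    """Count biological and chimeric reads in a tab-separated predictions file."""
    counts = Counter()
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # Skip blank lines and any row whose second field is not 0 or 1
            # (for example a header row, if the file has one).
            if len(row) < 2 or row[1] not in LABELS:
                continue
            counts[LABELS[row[1]]] += 1
    return counts
```

For example, summarize_predictions("your_data.predictions") returns a Counter from which the chimeric fraction can be computed as counts["chimeric"] / sum(counts.values()) when the file is non-empty.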

Need more help? See the Quick Start Tutorial for a complete walkthrough.

Documentation

Full documentation is available at ylab-hi.github.io/ChimeraLM

Features

  • High Accuracy: Deep learning model trained on real WGA data
  • GPU Accelerated: Optimized for CUDA, MPS (Apple Silicon), and CPU
  • Easy to Use: Simple CLI with sensible defaults
  • Fast Processing: Batch inference with configurable parallelism
  • Web Interface: Interactive web UI for visualization and analysis
  • Production Ready: Includes filtering, sorting, and indexing of BAM files

Contributing

Contributions are welcome! See our Contributing Guide for development setup and guidelines.

Citation

If you use ChimeraLM in your research, please cite:

@software{chimeralm2025,
  title={ChimeraLM: A genomic language model to identify chimera artifacts},
  author={Li, Yangyang and Guo, Qingxiang and Yang, Rendong},
  year={2025},
  url={https://github.com/ylab-hi/ChimeraLM}
}

License

Apache License 2.0 - see LICENSE for details.

Download files

Source Distribution

chimeralm-1.0.4.tar.gz (12.6 MB)

Uploaded Source

Built Distribution

chimeralm-1.0.4-py3-none-any.whl (55.8 kB)

Uploaded Python 3

File details

Details for the file chimeralm-1.0.4.tar.gz.

File metadata

  • Download URL: chimeralm-1.0.4.tar.gz
  • Upload date:
  • Size: 12.6 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.5

File hashes

Hashes for chimeralm-1.0.4.tar.gz
Algorithm    Hash digest
SHA256       dfb86e85b5f7e051638e6d8f75363d065c4a51ad262f90219d1dcd71cfc5f25d
MD5          7039e864eb46923cccc2531042cbf78f
BLAKE2b-256  09a749325ef234d8f7f41842eceab409d77eb8de62499e0b5253f890491c9fe8

See more details on using hashes here.
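
A downloaded file can be checked against the SHA256 digest listed above with Python's standard hashlib module. This is a minimal sketch; the function names sha256_of and verify are hypothetical, and the local file path is whatever you saved the download as.

```python
import hashlib

# SHA256 digest listed above for chimeralm-1.0.4.tar.gz.
EXPECTED_SHA256 = "dfb86e85b5f7e051638e6d8f75363d065c4a51ad262f90219d1dcd71cfc5f25d"


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True when the file's SHA256 digest matches the expected value."""
    return sha256_of(path) == expected.lower()
```

For example, verify("chimeralm-1.0.4.tar.gz") should return True for an intact download. To check the wheel instead, pass its SHA256 digest as the expected argument.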

File details

Details for the file chimeralm-1.0.4-py3-none-any.whl.

File metadata

  • Download URL: chimeralm-1.0.4-py3-none-any.whl
  • Upload date:
  • Size: 55.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.9.5

File hashes

Hashes for chimeralm-1.0.4-py3-none-any.whl
Algorithm Hash digest
SHA256 fa6cf8b655a29cbb59e9a0bdaf37f48e8e8fd349e7553c1f78a2a37a709127d8
MD5 89fa2c258fb758270790eb4f3ae8ce74
BLAKE2b-256 8bc06ae38f7f4fa9e6ce3f7f596d0760d4017e34696eb411ec01d534ac5951f1
