ChimeraLM: A genomic language model to identify chimeric artifacts introduced by whole genome amplification (WGA).
Project description
ChimeraLM
A Genomic Language Model for Detecting WGA Chimeric Artifacts
A deep learning-powered tool to identify chimeric artifacts introduced by whole genome amplification (WGA).
Installation
pip install chimeralm
Requirements: Python 3.10, 3.11, or 3.12
For GPU support, installation instructions, and troubleshooting, see the Installation Guide.
Quick Start
# Predict chimeric reads (CPU)
chimeralm predict your_data.bam
# Predict with GPU acceleration
chimeralm predict your_data.bam --gpus 1 --batch-size 24
# Filter BAM to remove chimeric reads
chimeralm filter your_data.bam your_data.predictions
Output:
- Predictions: Tab-separated file with read names and labels (0=biological, 1=chimeric)
- Filtered BAM: {input}.filtered.sorted.bam with chimeric reads removed
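As an illustration of working with the predictions file, the sketch below tallies reads per label in Python. It assumes the file has exactly two tab-separated columns (read name, then label) and no header row. The function name `summarize_predictions` and that exact layout are assumptions made for this example, not part of ChimeraLM's documented API.

```python
import csv
from collections import Counter
from pathlib import Path

# Label meanings as listed under "Output": 0 = biological, 1 = chimeric.
LABELS = {"0": "biological", "1": "chimeric"}


def summarize_predictions(path):
    """Count reads per label in a tab-separated predictions file.

    Assumption: each row is "<read name>\t<label>" with no header row.
    Rows that do not fit that shape are counted as "unrecognized".
    """
    counts = Counter()
    with Path(path).open(newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if not row:
                continue  # skip blank lines
            label = row[1].strip() if len(row) >= 2 else ""
            counts[LABELS.get(label, "unrecognized")] += 1
    return counts
```

For example, `summarize_predictions("your_data.predictions")["chimeric"]` would give the number of reads labeled 1, which can be a quick sanity check before filtering.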
Need more help? See the Quick Start Tutorial for a complete walkthrough.
Documentation
Full documentation is available at ylab-hi.github.io/ChimeraLM
Key Resources:
- Installation Guide - Setup with pip, conda, uv, or from source
- Quick Start Tutorial - Your first prediction in 15 minutes
- CLI Reference - Complete command documentation
- BAM Filtering Tutorial - Comprehensive filtering guide
- Performance Optimization - Speed up your analysis
- Troubleshooting - Common issues and solutions
Features
- High Accuracy: Deep learning model trained on real WGA data
- GPU Accelerated: Runs on CUDA and MPS (Apple Silicon), with CPU also supported
- Easy to Use: Simple CLI with sensible defaults
- Fast Processing: Batch inference with configurable parallelism
- Web Interface: Interactive web UI for visualization and analysis
- Production Ready: Includes filtering, sorting, and indexing of BAM files
Contributing
Contributions are welcome! See our Contributing Guide for development setup and guidelines.
Citation
If you use ChimeraLM in your research, please cite:
@software{chimeralm2025,
title={ChimeraLM: A genomic language model to identify chimera artifacts},
author={Li, Yangyang and Guo, Qingxiang and Yang, Rendong},
year={2025},
url={https://github.com/ylab-hi/ChimeraLM}
}
License
Apache License 2.0 - see LICENSE for details.
Download files
File details
Details for the file chimeralm-1.0.1.tar.gz.
File metadata
- Download URL: chimeralm-1.0.1.tar.gz
- Upload date:
- Size: 12.6 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8972c3df6d758c8a8e15beb6b36ae951be20ce00a6903d7c77ef943417ee56da |
| MD5 | 872584ac3495915df2a369c1491202af |
| BLAKE2b-256 | 528ab76ba48083e8501d5f926e8448b433201b4895d2ef67fbdb5b24ccb39d23 |
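A downloaded copy of chimeralm-1.0.1.tar.gz can be checked against the SHA256 digest listed above. The following is a minimal sketch using only Python's standard `hashlib` module. The helper names `sha256_of_file` and `matches_expected` are hypothetical and chosen for this example.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_expected("chimeralm-1.0.1.tar.gz", "8972c3df6d758c8a8e15beb6b36ae951be20ce00a6903d7c77ef943417ee56da")` should return `True` for an intact download of the source distribution, assuming the file was saved under that name in the current directory.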
File details
Details for the file chimeralm-1.0.1-py3-none-any.whl.
File metadata
- Download URL: chimeralm-1.0.1-py3-none-any.whl
- Upload date:
- Size: 55.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.9.5
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | c664a03e50b71f301aeef101ed4c3a34262c32d66cae1176181f5d48f619cfda |
| MD5 | 5de6264245d7f00209c749394184b193 |
| BLAKE2b-256 | b264383de353318baf0ce5a2411f11e177d9aa21d43fbe3453d70ec804d638d1 |