DGscRNA

A Python package for single-cell RNA-seq cell type annotation using marker-based scoring and deep learning refinement.

Overview

DGscRNA combines traditional marker-based cell type scoring with deep learning to resolve ambiguous cell type assignments in single-cell RNA-seq data. The workflow includes:

  1. Preprocessing: Quality control, normalization, and dimensionality reduction
  2. Clustering: Multiple clustering algorithms (Leiden, HDBSCAN, K-means)
  3. Marker Scoring: Density-based scoring using known cell type markers
  4. Deep Learning: Neural network refinement of ambiguous annotations
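
To make the hand-off between steps 3 and 4 concrete, here is a minimal sketch in plain Python. It is not the package's density-based scoring: it substitutes a simple mean-expression score per cluster, and the function name `score_clusters`, the `min_margin` threshold, and the `"Unknown"` label are hypothetical choices made for illustration. Clusters whose best score does not clearly beat the runner-up are flagged, which is the kind of ambiguous annotation the deep learning step is meant to refine.

```python
from statistics import mean


def score_clusters(expression, clusters, markers, min_margin=0.5):
    """Assign each cluster the cell type with the highest mean marker expression.

    expression: {cell_id: {gene: normalized value}}
    clusters:   {cell_id: cluster label}
    markers:    {cell_type: [marker genes]} (must not be empty)
    A cluster is labeled "Unknown" when the best score beats the runner-up
    by less than min_margin (a hypothetical threshold, not a package default).
    """
    cells_by_cluster = {}
    for cell, label in clusters.items():
        cells_by_cluster.setdefault(label, []).append(cell)

    annotations = {}
    for label, cells in cells_by_cluster.items():
        scores = {}
        for cell_type, genes in markers.items():
            # Genes missing from a cell count as zero expression.
            values = [expression[c].get(g, 0.0) for c in cells for g in genes]
            scores[cell_type] = mean(values) if values else 0.0
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        best_type, best = ranked[0]
        runner_up = ranked[1][1] if len(ranked) > 1 else float("-inf")
        annotations[label] = best_type if best - runner_up >= min_margin else "Unknown"
    return annotations


# Toy data: cluster "0" is clearly T cells, cluster "1" is ambiguous.
expression = {
    "c1": {"CD3E": 2.0, "MS4A1": 0.1},
    "c2": {"CD3E": 1.8, "MS4A1": 0.0},
    "c3": {"CD3E": 0.9, "MS4A1": 1.0},
    "c4": {"CD3E": 1.0, "MS4A1": 1.1},
}
clusters = {"c1": "0", "c2": "0", "c3": "1", "c4": "1"}
markers = {"T cell": ["CD3E"], "B cell": ["MS4A1"]}

annotations = score_clusters(expression, clusters, markers)
print(annotations)  # {'0': 'T cell', '1': 'Unknown'}
```

In this toy setup, the cells in the confidently labeled cluster would serve as training examples, and the cells left as "Unknown" are the ones a classifier would then be asked to label.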

Installation

pip install dgscrna

Or install from source:

git clone https://github.com/yourusername/DGscRNA.git
cd DGscRNA
pip install -e .

Quick Start

import scanpy as sc
import dgscrna as dg

# Load your data
adata = sc.read_h5ad('your_data.h5ad')

# Run the complete pipeline
results = dg.run_dgscrna_pipeline(
    adata=adata,
    marker_folder='path/to/marker/sets/',
    clustering_methods=['leiden', 'hdbscan'],
    deep_learning=True
)

# View results
sc.pl.umap(adata, color=['leiden', 'CellMarker_Thyroid_mean_DGscRNA'])

Input Data Format

Single-cell Data

  • Format: AnnData object (scanpy/anndata)
  • Requirements: Preprocessed and normalized gene expression matrix

Marker Sets

  • Format: CSV files in a folder
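  • Structure: One column per cell type, with the cell type name as the column header and that cell type's marker genes listed down the rows; the first, unnamed column is a row index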
  • Example:
,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,Gene9
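
To check a marker file against this layout before running the pipeline, a small reader along the following lines may help. This is a sketch using only the standard library; `read_marker_csv` is a hypothetical helper, not part of the dgscrna API, and it assumes the first column is the unnamed index shown in the example.

```python
import csv
import io


def read_marker_csv(handle):
    """Return {cell_type: [marker genes]} from a marker-set CSV.

    Assumes an unnamed index column first, then one column per cell type.
    Empty cells are skipped, so columns may hold different numbers of genes.
    """
    reader = csv.reader(handle)
    header = next(reader)
    cell_types = header[1:]  # drop the unnamed index column
    markers = {name: [] for name in cell_types}
    for row in reader:
        for name, gene in zip(cell_types, row[1:]):
            gene = gene.strip()
            if gene:
                markers[name].append(gene)
    return markers


example = """,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,Gene9
"""

markers = read_marker_csv(io.StringIO(example))
print(markers["CellType1"])  # ['Gene1', 'Gene2', 'Gene3']
```

For a file on disk, pass an open file handle instead of the `io.StringIO` wrapper, for example `open(path, newline="")`.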

Output

  • AnnData object: With added annotation columns
  • Results dictionary: Training scores and metrics
  • Visualization: UMAP plots with annotations
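
One way to locate and summarize the added annotation columns is sketched below. It assumes the columns end in `_DGscRNA`, a naming pattern inferred only from the `CellMarker_Thyroid_mean_DGscRNA` column in the Quick Start and not confirmed as a general rule; both helper functions and the example labels are hypothetical.

```python
from collections import Counter


def dgscrna_columns(column_names, suffix="_DGscRNA"):
    """Return the column names that end with the assumed annotation suffix."""
    return [name for name in column_names if name.endswith(suffix)]


def label_counts(labels):
    """Count cells per annotated label, most frequent first."""
    return Counter(labels).most_common()


# Plain lists stand in for adata.obs.columns and one annotation column.
columns = ["leiden", "hdbscan", "CellMarker_Thyroid_mean_DGscRNA"]
labels = ["Thyrocyte", "Thyrocyte", "T cell"]

print(dgscrna_columns(columns))  # ['CellMarker_Thyroid_mean_DGscRNA']
print(label_counts(labels))      # [('Thyrocyte', 2), ('T cell', 1)]
```

With a real AnnData object, the same helpers can be given `adata.obs.columns` and `adata.obs[column_name]`, since `adata.obs` is a pandas DataFrame.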

License

GPL-3.0 License - see LICENSE file for details.

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

Support

For questions and support, please open an issue on GitHub or contact the maintainers.
