DGscRNA

A Python package for single-cell RNA-seq cell type annotation using marker-based scoring and deep learning refinement.

Overview

DGscRNA combines traditional marker-based cell type scoring with deep learning to resolve ambiguous cell type assignments in single-cell RNA-seq data. The workflow includes:

  1. Preprocessing: Quality control, normalization, and dimensionality reduction
  2. Clustering: Multiple clustering algorithms (Leiden, HDBSCAN, K-means)
  3. Marker Scoring: Density-based scoring using known cell type markers
  4. Deep Learning: Neural network refinement of ambiguous annotations
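To make the hand-off between steps 3 and 4 concrete, here is a toy sketch of how a marker score can leave a cluster undecided. It is not DGscRNA's implementation: the package describes its scoring as density-based, whereas this sketch swaps in a plain mean of marker expression for brevity. The function names, the `margin` threshold, and the expression values are all illustrative assumptions.

```python
from statistics import mean


def score_cluster(cluster_expr, marker_sets):
    """Score each cell type as the mean expression of its markers.

    Illustrative stand-in for marker scoring (assumption: a plain mean,
    not the density-based score the package describes).
    cluster_expr maps gene name to average expression in one cluster.
    """
    scores = {}
    for cell_type, genes in marker_sets.items():
        present = [cluster_expr[g] for g in genes if g in cluster_expr]
        scores[cell_type] = mean(present) if present else 0.0
    return scores


def assign_label(scores, margin=0.5):
    """Pick the top-scoring cell type, or 'Undecided' on a near tie.

    `margin` is a hypothetical threshold: when the best and second-best
    scores are closer than this, the cluster is left for a later
    refinement step instead of being labelled.
    """
    if not scores:
        return "Undecided"
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return "Undecided"
    return ranked[0][0]


# Made-up marker sets and per-cluster average expression values.
marker_sets = {"T cell": ["CD3D", "CD3E"], "B cell": ["CD79A", "MS4A1"]}
clusters = {
    "0": {"CD3D": 2.0, "CD3E": 1.8, "CD79A": 0.1, "MS4A1": 0.0},
    "1": {"CD3D": 0.9, "CD3E": 0.7, "CD79A": 0.8, "MS4A1": 0.6},
}

labels = {
    cluster_id: assign_label(score_cluster(expr, marker_sets))
    for cluster_id, expr in clusters.items()
}
print(labels)  # {'0': 'T cell', '1': 'Undecided'}
```

In this toy data, cluster 0 is clearly a T cell cluster, while cluster 1 scores almost equally for both types. Clusters of the second kind are the ambiguous cases that a refinement step, such as a classifier trained on the confidently labelled cells, would be expected to resolve.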

Installation

pip install dgscrna

Or install from source (replace yourusername with the actual repository owner):

git clone https://github.com/yourusername/DGscRNA.git
cd DGscRNA
pip install -e .

Quick Start

import scanpy as sc
import dgscrna as dg

# Load your data
adata = sc.read_h5ad('your_data.h5ad')

# Run the complete pipeline
results = dg.run_dgscrna_pipeline(
    adata=adata,
    marker_folder='path/to/marker/sets/',
    clustering_methods=['leiden', 'hdbscan'],
    deep_learning=True
)

# View results
sc.pl.umap(adata, color=['leiden', 'CellMarker_Thyroid_mean_DGscRNA'])

Input Data Format

Single-cell Data

  • Format: AnnData object (scanpy/anndata)
  • Requirements: Preprocessed and normalized gene expression matrix

Marker Sets

  • Format: CSV files in a folder
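  • Structure: One column per cell type, with the cell type name as the column header; each row lists one marker gene per column. The first column is an unnamed row index.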
  • Example:
,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,Gene9
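As a rough illustration of the layout above, the following sketch reads one marker CSV into a mapping from cell type to gene list using only the standard library. The helper name `load_marker_sets` is hypothetical and not part of the dgscrna API. It assumes the first column is an unnamed row index and that shorter columns are padded with empty cells.

```python
import csv
import io


def load_marker_sets(handle):
    """Parse a marker CSV into {cell_type: [gene, ...]}.

    Hypothetical helper, not part of dgscrna. Assumes the first column
    is an unnamed row index and that each remaining column lists the
    marker genes for one cell type.
    """
    reader = csv.reader(handle)
    header = next(reader)
    cell_types = header[1:]  # skip the unnamed index column
    markers = {name: [] for name in cell_types}
    for row in reader:
        # Pair each cell type with the gene in its column; blank cells
        # (from columns of unequal length) are skipped.
        for name, gene in zip(cell_types, row[1:]):
            gene = gene.strip()
            if gene:
                markers[name].append(gene)
    return markers


example_csv = """,CellType1,CellType2,CellType3
0,Gene1,Gene4,Gene7
1,Gene2,Gene5,Gene8
2,Gene3,Gene6,Gene9
"""

marker_sets = load_marker_sets(io.StringIO(example_csv))
print(marker_sets["CellType1"])  # ['Gene1', 'Gene2', 'Gene3']
```

To read a file on disk instead, pass an open file handle, for example `open(path, newline="")`, in place of the `io.StringIO` object.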

Output

  • AnnData object: With added annotation columns
  • Results dictionary: Training scores and metrics
  • Visualization: UMAP plots with annotations

License

GPL-3.0 License - see LICENSE file for details.

Contributing

Contributions are welcome! Please read our contributing guidelines and submit pull requests.

Support

For questions and support, please open an issue on GitHub or contact the maintainers.
