A single-cell analysis pipeline.

scAnalyzer: A Single-Cell Analysis Toolkit

A Python toolkit for single-cell RNA sequencing (scRNA-seq) analysis.

🚧 Warning: this project is under heavy development and is not ready for production. API changes can happen frequently until it reaches a stable version. 🚧

Package version

pip install scAnalysis

🚀 Features

  • Core Data Structure: SingleCellDataset (AnnData-like) for efficient handling of sparse matrices and metadata.
  • Preprocessing: QC metrics, filtering (cells/genes), normalization, log-transformation, and highly variable gene (HVG) selection.
  • Dimensionality Reduction: PCA, t-SNE, and UMAP implementations.
  • Clustering: Graph-based (Leiden, Louvain), geometric (K-Means, Hierarchical), and density-based (DBSCAN) clustering.
  • Differential Expression: Statistical testing (T-test, Wilcoxon) to identify marker genes.
  • Visualization: Publication-ready plots (UMAP, t-SNE, Violin, Dotplot, Heatmap).
  • I/O: Support for 10x Genomics (.mtx), H5AD (.h5ad), and CSV formats.
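
As a rough illustration of the preprocessing feature listed above, the sketch below applies a gene-count filter, total-count normalization, and a log1p transform to a tiny dense count matrix using only the standard library. The function names, the `target_sum` value, and the list-of-lists layout are assumptions made for this example; they are not scAnalyzer's API, which operates on its own SingleCellDataset.

```python
import math

def filter_cells(counts, min_genes=2):
    """Keep cells (rows) that express at least `min_genes` genes (hypothetical helper)."""
    return [row for row in counts if sum(1 for c in row if c > 0) >= min_genes]

def normalize_total(counts, target_sum=10_000):
    """Scale each cell so its counts sum to `target_sum` (assumed value)."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Leave all-zero cells untouched to avoid dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([c * scale for c in row])
    return normalized

def log1p(counts):
    """Apply log(1 + x) element-wise to compress the dynamic range."""
    return [[math.log1p(c) for c in row] for row in counts]

# Toy matrix: 3 cells x 4 genes (rows are cells, columns are genes).
raw = [
    [4, 0, 6, 0],
    [0, 0, 1, 0],  # only one expressed gene, so the filter removes this cell
    [1, 1, 1, 1],
]

kept = filter_cells(raw, min_genes=2)
norm = normalize_total(kept, target_sum=10_000)
logged = log1p(norm)
```

After normalization every retained cell has the same total count, so differences in sequencing depth no longer dominate downstream comparisons.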

📦 Installation

Clone the repository and install the required dependencies:

git clone https://github.com/demirbasayyuce/scAnalyzer.git
cd scAnalyzer
pip install -r requirements.txt

⚡ Quick Start

Here is a minimal example of how to run a full analysis pipeline:

```python
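# These modules are the files listed under Project Structure; run this script
# from a directory where they are importable (for example, the cloned repository).
# Note: the `io` alias shadows Python's standard `io` module within this script.
import sc_io as io
import preprocessing as pp
import dimensionality as dim
import clustering as cl
import visualization as vis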

# 1. Load Data
data = io.read_10x_mtx('./data/pbmc3k/')

# 2. Preprocess
pp.filter_cells(data, min_genes=200, max_pct_mito=5.0)
pp.normalize_total(data)
pp.log1p(data)
pp.highly_variable_genes(data, n_top_genes=2000)
pp.scale(data)

# 3. Embed & Cluster
dim.run_pca(data)
dim.neighbors(data)
dim.run_umap(data)
cl.cluster_leiden(data, resolution=0.5)

# 4. Visualize
vis.plot_umap(data, color='leiden', save='umap_clusters.png')
```

📂 Project Structure

  • core.py: Main data structure (SingleCellDataset).
  • preprocessing.py: Filtering, normalization, and scaling functions.
  • dimensionality.py: PCA, Neighborhood Graph, t-SNE, UMAP.
  • clustering.py: Community detection algorithms.
  • differential.py: Marker gene identification.
  • visualization.py: Plotting functions.
  • sc_io.py: Input/Output handlers.
  • utils.py: Helpers for merging and subsampling.
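
To give a feel for what marker gene identification involves, here is a minimal sketch that ranks genes by Welch's t statistic, comparing cells inside a cluster against all other cells. It is an illustration only: the function names, gene names, and toy values are hypothetical, and which t-test variant differential.py actually uses is not stated here.

```python
import math
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = variance(group_a), variance(group_b)  # sample variances (n - 1)
    se = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    diff = mean(group_a) - mean(group_b)
    if se == 0:
        # Both groups are constant: the statistic is undefined (0/0) or infinite.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se

def rank_markers(expr_by_gene, in_cluster):
    """Rank genes from most to least up-regulated in the cluster."""
    scores = {}
    for gene, values in expr_by_gene.items():
        inside = [v for v, flag in zip(values, in_cluster) if flag]
        outside = [v for v, flag in zip(values, in_cluster) if not flag]
        scores[gene] = welch_t(inside, outside)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical log-normalized expression for 6 cells; the first 3 are in the cluster.
expr = {
    "GENE_A": [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],
    "GENE_B": [2.0, 3.0, 4.0, 2.0, 3.0, 4.0],
    "GENE_C": [1.0, 2.0, 3.0, 5.0, 6.0, 7.0],
}
mask = [True, True, True, False, False, False]
ranking = rank_markers(expr, mask)
```

A real analysis would also compute p-values and correct them for multiple testing; the sketch stops at the statistic to stay within the standard library.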

🧪 Running Tests

The project includes a suite of unit tests. Run them from the directory that contains the test folder:

python -m unittest discover test
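
With its default pattern, unittest discovery collects files named test*.py under the given start directory. A test module picked up this way might look like the sketch below. The helper `pct_mito` and the test names are hypothetical, and the helper is defined inline only to keep the example self-contained; the project's real tests import from its own modules.

```python
import unittest

def pct_mito(counts, is_mito):
    """Hypothetical helper: percentage of a cell's counts from mitochondrial genes."""
    total = sum(counts)
    if total == 0:
        return 0.0
    mito = sum(c for c, flag in zip(counts, is_mito) if flag)
    return 100.0 * mito / total

class TestPctMito(unittest.TestCase):
    def test_percentage_of_mito_counts(self):
        # 5 of 100 counts come from the single mitochondrial gene.
        self.assertAlmostEqual(pct_mito([5, 15, 80], [True, False, False]), 5.0)

    def test_empty_cell_returns_zero(self):
        # A cell with no counts must not raise a division error.
        self.assertEqual(pct_mito([0, 0, 0], [True, False, False]), 0.0)
```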

📄 License

MIT License.

