scAnalyzer: A Single-Cell Analysis Toolkit
A Python toolkit for single-cell RNA sequencing (scRNA-seq) analysis.
🚧 Warning: this project is under heavy development and is not ready for production. API changes may happen frequently until a stable version is reached. 🚧
🚀 Features
- Core Data Structure: SingleCellDataset (AnnData-like) for efficient handling of sparse matrices and metadata.
- Preprocessing: QC metrics, filtering (cells/genes), normalization, log-transformation, and highly variable gene (HVG) selection.
- Dimensionality Reduction: PCA, t-SNE, and UMAP implementations.
- Clustering: Graph-based (Leiden, Louvain), geometric (K-Means, Hierarchical), and density-based (DBSCAN) clustering.
- Differential Expression: Statistical testing (T-test, Wilcoxon) to identify marker genes.
- Visualization: Publication-ready plots (UMAP, t-SNE, Violin, Dotplot, Heatmap).
- I/O: Support for 10x Genomics (.mtx), H5AD (.h5ad), and CSV formats.
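To make the preprocessing steps listed above concrete, here is a minimal, dependency-free sketch of total-count normalization, log-transformation, and variance-based gene ranking. It is an illustration only and not scAnalyzer's implementation: the function names, the `target_sum` default, and the plain-variance ranking are all assumptions made for this example, and a real pipeline would operate on sparse matrices instead of nested lists.

```python
import math
from statistics import pvariance


def normalize_total(counts, target_sum=10_000.0):
    """Scale each cell (row) so that its counts sum to target_sum."""
    normalized = []
    for cell in counts:
        total = sum(cell)
        # Cells with no counts are left as all zeros.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([value * scale for value in cell])
    return normalized


def log1p_transform(matrix):
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(value) for value in cell] for cell in matrix]


def top_variable_genes(matrix, n_top):
    """Return the column indices of the n_top genes with the largest variance."""
    n_genes = len(matrix[0])
    variances = [pvariance([cell[g] for cell in matrix]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return ranked[:n_top]


# Toy count matrix: 3 cells (rows) x 3 genes (columns).
counts = [
    [1, 0, 3],
    [0, 2, 2],
    [5, 5, 0],
]
normalized = normalize_total(counts)
logged = log1p_transform(normalized)
hvg = top_variable_genes(logged, n_top=2)
```

After normalization every non-empty cell has the same total, so differences between cells reflect relative expression and not sequencing depth.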
📦 Installation
Clone the repository and install the required dependencies:
git clone https://github.com/demirbasayyuce/scAnalyzer.git
cd scAnalyzer
pip install -r requirements.txt
⚡ Quick Start
Here is a minimal example of how to run a full analysis pipeline:
```python
import sc_io as io
import preprocessing as pp
import dimensionality as dim
import clustering as cl
import visualization as vis
# 1. Load Data
data = io.read_10x_mtx('./data/pbmc3k/')
# 2. Preprocess
pp.filter_cells(data, min_genes=200, max_pct_mito=5.0)
pp.normalize_total(data)
pp.log1p(data)
pp.highly_variable_genes(data, n_top_genes=2000)
pp.scale(data)
# 3. Embed & Cluster
dim.run_pca(data)
dim.neighbors(data)
dim.run_umap(data)
cl.cluster_leiden(data, resolution=0.5)
# 4. Visualize
vis.plot_umap(data, color='leiden', save='umap_clusters.png')
```
📂 Project Structure
- core.py: Main data structure (SingleCellDataset).
- preprocessing.py: Filtering, normalization, and scaling functions.
- dimensionality.py: PCA, Neighborhood Graph, t-SNE, UMAP.
- clustering.py: Community detection algorithms.
- differential.py: Marker gene identification.
- visualization.py: Plotting functions.
- sc_io.py: Input/Output handlers.
- utils.py: Helpers for merging and subsampling.
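For readers unfamiliar with AnnData-like containers, the sketch below shows the general idea: a cells x genes matrix kept together with cell names, gene names, and per-cell annotations, so that subsetting keeps everything aligned. `TinyDataset` and all of its fields and methods are hypothetical names invented for this example. They are not the interface of SingleCellDataset in core.py, which may differ.

```python
from dataclasses import dataclass, field


@dataclass
class TinyDataset:
    """Hypothetical AnnData-like container; not scAnalyzer's SingleCellDataset."""

    X: list          # cells x genes matrix, stored as a list of rows
    obs_names: list  # one name per cell (row)
    var_names: list  # one name per gene (column)
    obs: dict = field(default_factory=dict)  # per-cell annotations, e.g. cluster labels

    def __post_init__(self):
        # Validate that the matrix dimensions match the name lists.
        if len(self.X) != len(self.obs_names):
            raise ValueError("X must have one row per cell")
        if any(len(row) != len(self.var_names) for row in self.X):
            raise ValueError("each row of X must have one value per gene")

    @property
    def shape(self):
        return (len(self.obs_names), len(self.var_names))

    def subset_cells(self, keep):
        """Return a new dataset containing only the cells at the given row indices."""
        return TinyDataset(
            X=[self.X[i] for i in keep],
            obs_names=[self.obs_names[i] for i in keep],
            var_names=list(self.var_names),
            obs={key: [values[i] for i in keep] for key, values in self.obs.items()},
        )


data = TinyDataset(
    X=[[1, 0], [0, 2], [3, 3]],
    obs_names=["cell_a", "cell_b", "cell_c"],
    var_names=["GENE1", "GENE2"],
    obs={"cluster": ["0", "1", "0"]},
)
subset = data.subset_cells([0, 2])
```

Keeping the matrix and its annotations in one object is what lets filtering steps drop cells without the metadata falling out of sync.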
🧪 Running Tests
The project includes a comprehensive suite of unit tests. Run them using:
python -m unittest discover test
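As a rough illustration of the kind of file that `unittest` discovery picks up (by default it collects files matching `test*.py`), a test module might look like the sketch below. The `filter_cells` helper here is a hypothetical stand-in defined inline so the example is self-contained. It is not the function from preprocessing.py, and the project's actual tests may be organized differently.

```python
import unittest


def filter_cells(genes_per_cell, min_genes=200):
    """Hypothetical stand-in: return indices of cells with at least min_genes detected genes."""
    return [i for i, n_genes in enumerate(genes_per_cell) if n_genes >= min_genes]


class TestFilterCells(unittest.TestCase):
    def test_drops_cells_below_threshold(self):
        kept = filter_cells([150, 200, 950], min_genes=200)
        self.assertEqual(kept, [1, 2])

    def test_empty_input_returns_empty_list(self):
        self.assertEqual(filter_cells([]), [])
```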
📄 License
MIT License.
File details
Details for the file scanalysis-0.1.0.tar.gz.
File metadata
- Download URL: scanalysis-0.1.0.tar.gz
- Upload date:
- Size: 32.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 51108efd967bd08f1e20852594dc575f69da09f6adf3c580d6ab6e7577961ea7 |
| MD5 | 6c4774325c2da83aab7253a92af98040 |
| BLAKE2b-256 | 9ca5fb6c7fec8b0bd8c3abc4a761f9198b2c4226ff4019019bfb9991572ec41a |
File details
Details for the file scanalysis-0.1.0-py3-none-any.whl.
File metadata
- Download URL: scanalysis-0.1.0-py3-none-any.whl
- Upload date:
- Size: 26.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.19
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 81bbf0184363cb03ff44a6dfb8485c0f791a097fde8ca02cf4dc6690c0e21b12 |
| MD5 | 172d06f9aa5a96a8161d15c72845b14e |
| BLAKE2b-256 | a3a4a9fe70dcabbb4f846e49ce2375cf795fbdd00ce97bca720416507561e33b |