Celline - A comprehensive toolkit for single-cell RNA sequencing data analysis
Celline - Single Cell RNA-seq Analysis Pipeline
Celline is a comprehensive, interactive pipeline for single-cell RNA sequencing (scRNA-seq) analysis, designed to streamline the workflow from raw data to biological insights. It provides both command-line and web-based interfaces for flexible analysis workflows.
📖 Detailed Documentation: Celline Docs
Features
- 🔄 Automated Data Processing: From raw FASTQ files to expression matrices
- ✅ Quality Control: Built-in QC metrics and filtering with Scrublet doublet detection
- 📊 Dimensionality Reduction: PCA, t-SNE, and UMAP implementations
- 🔍 Clustering Analysis: Multiple clustering algorithms
- 🧬 Cell Type Prediction: Automated cell type annotation using scPred
- ⚖️ Batch Effect Correction: Multiple methods for data integration (Seurat, scVI)
- 🌐 Interactive Visualization: Web-based interface for data exploration
- 🔧 Flexible Execution: Support for local multithreading and PBS cluster execution
- 📁 Database Integration: Built-in support for SRA, GEO, and CNCB data repositories
- 🔬 R Integration: Seamless R/Seurat integration for advanced analysis
System Requirements
Required Dependencies
- Python: ≥3.10
- R: ≥4.0 with Seurat and other required packages
- Cell Ranger: For 10x Genomics data processing
- SRA Toolkit: For downloading SRA data (fastq-dump)
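Before installing, it can help to confirm that these requirements are visible from your shell. The following is a minimal standalone sketch, not part of Celline: the executable names `R` and `cellranger` are assumptions (only `fastq-dump` is named above), and it checks presence on PATH only, not the R version or installed R packages.

```python
import shutil
import sys

# Executable names are assumptions; only fastq-dump is named in this README.
REQUIRED_TOOLS = {
    "R": "R",
    "Cell Ranger": "cellranger",
    "SRA Toolkit": "fastq-dump",
}

MIN_PYTHON = (3, 10)


def check_requirements(tools=REQUIRED_TOOLS):
    """Return a dict mapping each requirement to True if it appears satisfied."""
    status = {"Python >= 3.10": sys.version_info >= MIN_PYTHON}
    for label, executable in tools.items():
        # shutil.which returns None when the executable is not on PATH.
        status[label] = shutil.which(executable) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_requirements().items():
        print(f"{'OK     ' if ok else 'MISSING'}  {name}")
```

Once Celline is installed, `celline init` performs its own dependency validation; this script is only a quick pre-check.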
Python Dependencies
All Python dependencies are automatically installed via pip. Key packages include:
- scanpy: Single-cell analysis
- pandas, polars: Data manipulation
- fastapi, uvicorn: Web API
- rich: Enhanced CLI interface
- pysradb: SRA database access
Installation
Option 1: Install from PyPI
pip install celline
Option 2: Install from Source
git clone https://github.com/Kataoka-K-Lab/Celline.git
cd Celline
pip install -e .
Option 3: Development Installation
git clone https://github.com/Kataoka-K-Lab/Celline.git
cd Celline
pip install -e ".[dev]"
Quick Start
1. Initialize Your Project
Start by initializing a new project. This will validate system dependencies and create configuration files:
celline init
This command will:
- Check for required system dependencies (R, Cell Ranger, SRA Toolkit)
- Set up R environment configuration
- Create project configuration files
- Prompt for project name and settings
2. Configure Execution Settings (Optional)
Configure execution parameters for your system:
# Interactive configuration
celline config
# Or set specific options
celline config --system multithreading --nthread 8
celline config --system PBS --pbs-server your-cluster-name
3. Explore Available Functions
List all available analysis functions:
celline list
Get detailed help for specific functions:
celline help download
celline help preprocess
4. Basic Analysis Workflow
Download Public Data
# Download from SRA/GEO
celline run download --accession GSE123456
celline run download --accession SRR123456
# Download from CNCB
celline run download --accession CRA123456
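The example accessions above follow recognizable prefixes (GSE, SRR, CRA). As a rough illustration of how an accession could be matched to its repository before downloading, here is a small sketch. The mapping is inferred only from the examples in this section, the function name `guess_repository` is hypothetical, and this is not Celline's own routing logic; real repositories use additional prefixes that are not covered here.

```python
import re

# Prefix-to-repository mapping inferred from the example accessions above;
# it is illustrative and not Celline's own lookup table.
ACCESSION_PATTERNS = {
    "GEO": re.compile(r"^GSE\d+$"),
    "SRA": re.compile(r"^SRR\d+$"),
    "CNCB": re.compile(r"^CRA\d+$"),
}


def guess_repository(accession):
    """Return the repository name an accession appears to belong to, or None."""
    cleaned = accession.strip().upper()
    for repository, pattern in ACCESSION_PATTERNS.items():
        if pattern.match(cleaned):
            return repository
    return None
```

A check like this can catch a mistyped accession before a long download is started.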
Data Preprocessing
# Quality control and preprocessing
celline run preprocess --input raw_data/ --output processed/
# Gene expression counting (10x data)
celline run count --input cellranger_output/ --output counts/
Create Seurat Objects
# Create Seurat object for downstream analysis
celline run create_seurat --input counts/ --output seurat_object.rds
Advanced Analysis
# Dimensionality reduction
celline run reduce --input seurat_object.rds --methods pca,umap,tsne
# Cell type prediction
celline run predict_celltype --input seurat_object.rds --reference ref_data/
# Batch effect correction
celline run integrate --input multiple_samples/ --method seurat
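The commands in this workflow can also be driven from a script. The sketch below assembles a few of them in order and prints them as a dry run. It uses only the subcommands and flags shown above. The helper names `build_workflow` and `run_workflow` are hypothetical, the paths are this README's placeholders, and whether your data flows between steps exactly this way is an assumption to verify with `celline help <function_name>`.

```python
import shlex
import subprocess


def build_workflow(accession, counts_dir="counts/", seurat_rds="seurat_object.rds"):
    """Return the celline commands for a download-to-reduction run, in order."""
    return [
        ["celline", "run", "download", "--accession", accession],
        ["celline", "run", "count", "--input", "cellranger_output/", "--output", counts_dir],
        ["celline", "run", "create_seurat", "--input", counts_dir, "--output", seurat_rds],
        ["celline", "run", "reduce", "--input", seurat_rds, "--methods", "pca,umap,tsne"],
    ]


def run_workflow(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(shlex.join(command))
        if not dry_run:
            # check=True stops the chain at the first failing step.
            subprocess.run(command, check=True)


run_workflow(build_workflow("GSE123456"))
```

Passing `dry_run=False` executes the commands for real, so review the printed list first.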
5. Interactive Web Interface
Launch the interactive web interface for visual analysis:
celline interactive
This will:
- Start the FastAPI backend server
- Launch the Vue.js frontend
- Open your web browser automatically
- Provide interactive data exploration tools
6. API Server Only (for Development)
Start only the API server for testing:
celline api
Available Functions
| Function | Description | Usage Example |
|---|---|---|
| init | Initialize project and validate dependencies | celline init |
| download | Download scRNA-seq data from public repositories | celline run download --accession GSE123456 |
| preprocess | Quality control and preprocessing | celline run preprocess |
| count | Gene expression quantification | celline run count |
| create_seurat | Create Seurat objects | celline run create_seurat |
| reduce | Dimensionality reduction (PCA, UMAP, t-SNE) | celline run reduce |
| integrate | Batch effect correction and data integration | celline run integrate |
| predict_celltype | Automated cell type annotation | celline run predict_celltype |
| batch_cor | Batch correlation analysis | celline run batch_cor |
| interactive | Launch web interface | celline interactive |
| sync_DB | Update local databases | celline run sync_DB |
| info | Show system information | celline info |
Project Structure
When you initialize a project, Celline creates the following structure:
your_project/
├── setting.toml # Project configuration
├── data/ # Raw and processed data
├── results/ # Analysis results
├── scripts/ # Generated analysis scripts
└── logs/ # Execution logs
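To confirm that a directory matches this layout, a short check such as the following can be used. It is a sketch based solely on the entries listed above. The function name `missing_entries` is hypothetical and not part of Celline.

```python
from pathlib import Path

# Entries taken from the layout listed above.
EXPECTED_ENTRIES = ["setting.toml", "data", "results", "scripts", "logs"]


def missing_entries(project_root):
    """Return the expected entries that do not exist under project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

An empty list means every expected file and directory is present. For example, `missing_entries("your_project")` on a fresh, uninitialized directory returns all five names.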
Configuration
Celline uses setting.toml files for configuration:
[project]
name = "my_project"
version = "0.01"
[execution]
system = "multithreading" # or "PBS"
nthread = 8
pbs_server = "your-cluster" # for PBS system
[R]
r_path = "/usr/local/bin/R"
[fetch]
wait_time = 4 # seconds between API calls
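Because setting.toml is plain TOML, it can be read and sanity-checked from Python. The sketch below parses the example configuration above with the standard-library `tomllib` (Python 3.11+; on 3.10 it falls back to the third-party `tomli` package, which must be installed separately). The validation rules are assumptions drawn from the comments in the example, not Celline's own schema, and `load_settings` is a hypothetical helper.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport for Python 3.10

SETTING_TOML = """
[project]
name = "my_project"
version = "0.01"

[execution]
system = "multithreading"
nthread = 8
pbs_server = "your-cluster"

[R]
r_path = "/usr/local/bin/R"

[fetch]
wait_time = 4
"""


def load_settings(text):
    """Parse setting.toml content and run a few sanity checks."""
    settings = tomllib.loads(text)
    execution = settings["execution"]
    # Allowed values taken from the comment in the example configuration.
    if execution["system"] not in {"multithreading", "PBS"}:
        raise ValueError(f"unknown execution system: {execution['system']!r}")
    if execution["nthread"] < 1:
        raise ValueError("nthread must be at least 1")
    return settings


settings = load_settings(SETTING_TOML)
print(settings["execution"]["system"], settings["execution"]["nthread"])
```

To check a real project, read the file with `Path("setting.toml").read_text()` and pass the result to `load_settings`.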
Advanced Usage
Running on HPC Clusters
For PBS/Torque clusters:
celline config --system PBS --pbs-server your-cluster-name
celline run preprocess # Will submit PBS jobs automatically
Custom Analysis Scripts
Celline generates executable scripts in the scripts/ directory that can be run independently or modified for custom workflows.
R Integration
Access Seurat objects and run custom R analysis:
# R scripts are available in template/hook/R/
# Custom R functions can be added to the pipeline
Troubleshooting
Common Issues
- Missing Dependencies: Run celline init to validate all dependencies
- R Package Issues: Ensure Seurat and required R packages are installed
- Memory Issues: Adjust thread count with celline config --nthread <number>
- Web Interface Not Loading: Check that ports 8000 and 3000 are available
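For the port check mentioned in the last item, one way to see whether ports 8000 and 3000 are free is to try binding to them. This is a minimal sketch that assumes the interface listens on localhost. The function name `port_is_free` is hypothetical and the script is independent of Celline.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can be bound to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Binding fails when another process already holds the port.
            return False
    return True


if __name__ == "__main__":
    for port in (8000, 3000):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

If a port is reported as in use, stop the process holding it before running `celline interactive` again.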
Getting Help
# General help
celline help
# Function-specific help
celline help <function_name>
# System information
celline info
# List all functions
celline list
Contributing
We welcome contributions! Please see our contributing guidelines for more information.
Citation
If you use Celline in your research, please cite:
[Citation information to be added]
License
This project is licensed under the MIT License - see the LICENSE file for details.
Support