A cross-platform toolkit for building RNA velocity-ready spliced/unspliced matrices

# velocity-kit

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

## Overview

Standard RNA velocity methods expect **spliced** and **unspliced** counts, but many modern single-cell platforms don't directly output these layers. `velocity-kit` provides platform-specific tools to generate velocity-compatible matrices using the **dual-run subtraction method**.

### Supported Platforms

- ✅ **Fluent BioSciences (PIPseq)** - via PIPseeker
- ✅ **10x Genomics** - via CellRanger with `--include-introns`
- 🚧 **Parse Biosciences** - Coming soon  

## Installation

### From PyPI (recommended)

```bash
pip install velocity-kit
```

### From source

```bash
git clone https://github.com/yourusername/velocitykit.git
cd velocitykit
pip install -e .
```

### Optional dependencies

To run scVelo preprocessing:

```bash
pip install "velocity-kit[scvelo]"
```

For development:

```bash
pip install "velocity-kit[dev]"
```

## Quick Start

### PIPseq (PIPseeker)

```bash
# Step 1: Generate velocity-compatible matrices
velocity-kit prep-pipseq \
  --total /path/to/pipseeker_total_run \
  --exonic /path/to/pipseeker_exons_only_run \
  --out-loom output.loom

# Step 2: Generate analysis report (optional)
velocity-kit run-scvelo output.loom -o reports/sample1
```

### 10x Genomics (CellRanger)

```bash
# Step 1: Generate velocity-compatible matrices
velocity-kit prep-tenx \
  --total /path/to/cellranger_with_introns/raw_feature_bc_matrix \
  --exonic /path/to/cellranger_standard/raw_feature_bc_matrix \
  --out-loom output.loom

# Step 2: Generate analysis report (optional)
velocity-kit run-scvelo output.loom -o reports/sample1
```

**Note:** You can specify just `--out-h5ad` or just `--out-loom` if you only need one format.

## Usage

### Command Structure

```bash
velocity-kit <platform-command> [options]
```

Available platform commands:

- `prep-pipseq` - Prepare velocity matrices from PIPseeker outputs
- `prep-tenx` - Prepare velocity matrices from 10x Genomics CellRanger outputs
- `prep-parse` - Prepare velocity matrices from Parse Biosciences outputs (coming soon)
- `prep-scalebio` - Prepare velocity matrices from ScaleBio outputs (coming soon)
- `run-scvelo` - Run scVelo analysis and generate a comprehensive report from a loom file

### PIPseq Detailed Usage

#### Required Arguments

- `--total`: Directory with the PIPseeker run that includes introns (total counts)
- `--exonic`: Directory with the PIPseeker `--exons-only` run, using the RAW/UNFILTERED count matrix
- At least one of:
  - `--out-h5ad`: Output `.h5ad` file path
  - `--out-loom`: Output `.loom` file path

#### Optional Arguments

- `--genes-col`: Column index in `features.tsv` to use as gene ID (default: 0)
- `-v`, `--verbose`: Increase verbosity level (use `-v` for info, `-vv` for debug)
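The argument rules above (two required inputs, at least one output path, a repeatable `-v`) can be pictured with a small `argparse` sketch. This is not velocity-kit's actual parser: `build_parser`, `parse_args`, and `log_level` are hypothetical names, and treating "no `-v`" as warnings-only is an assumption.

```python
import argparse
import logging


def build_parser():
    """Hypothetical parser mirroring the documented prep-pipseq options."""
    parser = argparse.ArgumentParser(prog="prep-sketch")
    parser.add_argument("--total", required=True)
    parser.add_argument("--exonic", required=True)
    parser.add_argument("--out-h5ad")
    parser.add_argument("--out-loom")
    parser.add_argument("--genes-col", type=int, default=0)
    # action="count" turns -v into 1 and -vv into 2
    parser.add_argument("-v", "--verbose", action="count", default=0)
    return parser


def parse_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Enforce the "at least one of" rule for the two output options
    if not (args.out_h5ad or args.out_loom):
        parser.error("at least one of --out-h5ad or --out-loom is required")
    return args


def log_level(verbosity):
    """Map the -v count to a logging level (default level is an assumption)."""
    if verbosity >= 2:
        return logging.DEBUG
    if verbosity == 1:
        return logging.INFO
    return logging.WARNING
```

Calling `parse_args` without either output option exits with a usage error, which matches the "at least one of" wording in the argument list.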

#### Example

```bash
# Generate velocity-compatible matrices
velocity-kit prep-pipseq \
  --total Analysis/total_run \
  --exonic Analysis/exonic_raw_run \
  --out-loom velocity.loom \
  -v

# Generate comprehensive analysis report
velocity-kit run-scvelo velocity.loom \
  -o reports/sample1 \
  -n Sample1
```

### 10x Genomics Detailed Usage

#### Required Arguments

- `--total`: Directory with the CellRanger run using the `--include-introns` flag (or path to `raw_feature_bc_matrix`)
- `--exonic`: Directory with the standard CellRanger run (exons only). Use the RAW/UNFILTERED `raw_feature_bc_matrix`, NOT `filtered_feature_bc_matrix`
- At least one of:
  - `--out-h5ad`: Output `.h5ad` file path
  - `--out-loom`: Output `.loom` file path

#### Optional Arguments

- `--genes-col`: Column index in `features.tsv` to use as gene ID (default: 1 for gene symbols)
- `-v`, `--verbose`: Increase verbosity level (use `-v` for info, `-vv` for debug)
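To see what `--genes-col` selects, the sketch below reads a single column from a features file using only the standard library. It assumes the usual Cell Ranger layout for `features.tsv.gz` (column 0 is the Ensembl gene ID, column 1 the gene symbol, column 2 the feature type). `read_feature_column` is a hypothetical helper for illustration, not part of velocity-kit.

```python
import csv
import gzip
from pathlib import Path


def read_feature_column(features_path, col=1):
    """Return one column of a (optionally gzipped) tab-separated features file."""
    path = Path(features_path)
    # Use gzip for .gz files, plain open otherwise
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        # Skip empty lines; col=0 gives gene IDs, col=1 gene symbols
        return [row[col] for row in reader if row]
```

Gene symbols are not guaranteed to be unique, so column 0 may be the safer identifier when duplicates matter.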

#### Example

```bash
# Method 1: Point to the count directories directly
velocity-kit prep-tenx \
  --total cellranger_introns/outs/raw_feature_bc_matrix \
  --exonic cellranger_standard/outs/raw_feature_bc_matrix \
  --out-loom velocity.loom \
  -v

# Generate analysis report
velocity-kit run-scvelo velocity.loom -o reports/sample1

# Method 2: Point to the parent directories (will auto-find raw_feature_bc_matrix)
velocity-kit prep-tenx \
  --total cellranger_introns/outs \
  --exonic cellranger_standard/outs \
  --out-loom velocity.loom
```
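Method 2 relies on the tool locating `raw_feature_bc_matrix` under the given directory. How velocity-kit performs that lookup is not documented here; the sketch below shows one plausible approach, with `resolve_matrix_dir` and `MATRIX_FILES` as hypothetical names and the file names taken from the Python API example.

```python
from pathlib import Path

# File names as used in the Python API example of this README
MATRIX_FILES = ("matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz")


def resolve_matrix_dir(path, subdir="raw_feature_bc_matrix"):
    """Accept either the matrix directory itself or its parent (e.g. outs/)."""
    path = Path(path)
    for candidate in (path, path / subdir):
        # A directory qualifies only if all three matrix files are present
        if all((candidate / name).is_file() for name in MATRIX_FILES):
            return candidate
    raise FileNotFoundError(f"No {subdir} matrix files found under {path}")
```

Checking the given path first means Method 1 and Method 2 style inputs resolve to the same directory.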

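#### How to Generate the Required CellRanger Runs

Standard run (exonic only):

```bash
cellranger count --id=sample_exonic \
  --transcriptome=/path/to/refdata \
  --fastqs=/path/to/fastqs \
  --sample=MySample
```

Run with introns:

```bash
cellranger count --id=sample_with_introns \
  --transcriptome=/path/to/refdata \
  --fastqs=/path/to/fastqs \
  --sample=MySample \
  --include-introns
```

**Note:** The commands above match CellRanger versions before 7.0, where intronic reads are excluded unless `--include-introns` is given. From CellRanger 7.0 onward, intronic reads are counted by default, so the exons-only run needs `--include-introns=false`. Check `cellranger count --help` for the exact syntax in your version.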

### scVelo Analysis Report

Generate a comprehensive HTML report with QC plots, velocity analysis, and visualizations from a loom file.

#### Required Arguments

- `loom_path`: Path to the input `.loom` file (generated by the `prep-*` commands)

#### Optional Arguments

- `-o`, `--output-dir`: Output directory for plots and HTML report (default: `scvelo_analysis`)
- `-n`, `--sample-name`: Sample name for report title (default: derived from loom filename)
- `-v`, `--verbose`: Increase verbosity level (use `-v` for info, `-vv` for debug)

#### Requirements

This command requires `scvelo` and `scanpy` to be installed:

```bash
pip install scvelo scanpy
# or
pip install "velocity-kit[scvelo]"
```

#### Example

```bash
# Generate analysis report from loom file
velocity-kit run-scvelo velocity.loom \
  -o reports/sample1 \
  -n Sample1 \
  -v

# Use default output directory and auto-detect sample name
velocity-kit run-scvelo velocity.loom
```

#### Output

The report includes:

- **QC plots**: Total counts, gene counts, spliced/unspliced proportions
- **Velocity embeddings**: UMAP with velocity arrows and stream plots
- **Top velocity genes**: Ranked genes driving velocity patterns
- **HTML report**: All plots combined in an interactive HTML file
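As an illustration of the spliced/unspliced proportion QC, the sketch below computes the per-cell unspliced fraction from two count layers. It assumes cells-by-genes matrices, as scVelo stores them in `adata.layers["spliced"]` and `adata.layers["unspliced"]`. Whether the report computes the metric exactly this way is not stated in this README, and `unspliced_fraction_per_cell` is a hypothetical name.

```python
import numpy as np
from scipy import sparse


def unspliced_fraction_per_cell(spliced, unspliced):
    """Fraction of each cell's counts that are unspliced (0.0 for empty cells)."""
    # Sum counts over genes (axis=1) for every cell
    s = np.asarray(sparse.csr_matrix(spliced).sum(axis=1)).ravel()
    u = np.asarray(sparse.csr_matrix(unspliced).sum(axis=1)).ravel()
    total = s + u
    # Divide only where a cell has counts, leaving 0.0 elsewhere
    return np.divide(u, total, out=np.zeros(total.shape, dtype=float), where=total > 0)
```

With an `AnnData` object this could be called as `unspliced_fraction_per_cell(adata.layers["spliced"], adata.layers["unspliced"])`.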

## Python API

```python
from velocitykit import load_10x_mtx, align_and_union, build_velocity_adata
from pathlib import Path

# Load matrices
X_total, bc_total, g_total = load_10x_mtx(
    Path("total_run/matrix.mtx.gz"),
    Path("total_run/barcodes.tsv.gz"),
    Path("total_run/features.tsv.gz")
)

X_exon, bc_exon, g_exon = load_10x_mtx(
    Path("exonic_run/matrix.mtx.gz"),
    Path("exonic_run/barcodes.tsv.gz"),
    Path("exonic_run/features.tsv.gz")
)

# Align to union of genes and barcodes
X_total_u, X_exon_u, genes_u, bc_u = align_and_union(
    X_total, bc_total, g_total,
    X_exon, bc_exon, g_exon
)

# Build velocity-compatible AnnData
adata = build_velocity_adata(X_total_u, X_exon_u, genes_u, bc_u)

# Save
adata.write_h5ad("output.h5ad")
adata.write_loom("output.loom")
```
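For readers curious what aligning to a union of genes and barcodes involves, here is a conceptual sketch using SciPy sparse matrices. It is not the implementation of `align_and_union`; `union_preserving_order` and `reindex_to_union` are hypothetical helpers, and the sketch assumes rows are genes and columns are barcodes, as in a 10x `matrix.mtx` file.

```python
import numpy as np
from scipy import sparse


def union_preserving_order(first, second):
    """Unique labels from both lists, in order of first appearance."""
    return list(dict.fromkeys(list(first) + list(second)))


def reindex_to_union(X, row_labels, col_labels, all_rows, all_cols):
    """Embed X into a zero-filled matrix indexed by all_rows and all_cols."""
    row_pos = {label: i for i, label in enumerate(all_rows)}
    col_pos = {label: j for j, label in enumerate(all_cols)}
    coo = sparse.coo_matrix(X)
    # Translate each nonzero entry's old position to its position in the union
    new_rows = np.array([row_pos[row_labels[i]] for i in coo.row], dtype=np.int64)
    new_cols = np.array([col_pos[col_labels[j]] for j in coo.col], dtype=np.int64)
    return sparse.csr_matrix(
        (coo.data, (new_rows, new_cols)), shape=(len(all_rows), len(all_cols))
    )
```

Once both matrices share the same shape and label order, they can be subtracted entry by entry.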

## Why Dual-Run Subtraction?

For platforms that use complex molecular counting (MI correction, deduplication, multi-mapping resolution), BAM-based velocity methods can be invalid because the BAM file does not carry these counting transformations.

The dual-run subtraction approach:

1. Run your pipeline normally → counts include exonic + intronic molecules
2. Run with exons-only mode on the raw/unfiltered matrix → spliced-only molecules
3. Compute: `unspliced = total - spliced`

This preserves the platform's counting model and produces valid velocity layers.
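The subtraction step can be sketched with SciPy sparse matrices, assuming both inputs are already aligned to the same genes and barcodes. Clipping negative differences to zero and reporting how many occurred is an assumption made for this sketch; the README does not say how velocity-kit treats entries where the exonic count exceeds the total. `subtract_layers` is a hypothetical name.

```python
import numpy as np
from scipy import sparse


def subtract_layers(total, spliced):
    """Return (unspliced, n_negative) where unspliced = max(total - spliced, 0)."""
    total = sparse.csr_matrix(total, dtype=np.int64)
    spliced = sparse.csr_matrix(spliced, dtype=np.int64)
    unspliced = total - spliced
    # Count entries where the exonic run reported more molecules than the total run
    n_negative = int((unspliced.data < 0).sum())
    # Assumption: clip those entries to zero rather than keep negative counts
    unspliced.data = np.maximum(unspliced.data, 0)
    unspliced.eliminate_zeros()
    return unspliced, n_negative
```

A large `n_negative` would suggest the two runs are not comparable, for example because a filtered matrix was used for one of them.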

### When to Use Dual-Run Subtraction

- ✅ **PIPseq**: Always use dual-run (BAM-based methods are incorrect)
- ✅ **10x Genomics**: Recommended for consistency, especially with CellRanger ≥7.0
- ⚠️ **Other platforms**: Evaluate whether platform-specific counting differs from simple read counting

### Important Notes

⚠️ **For PIPseq:** The `--exonic` directory must point to the RAW/UNFILTERED exons-only run.

Do NOT use a filtered exonic matrix, because its called-cell set may not match the total matrix. A mismatch in barcodes between the two runs leads to incorrect velocity estimates.
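A quick way to spot this kind of mismatch before running the tool is to compare the barcode lists of the two runs. The sketch below is a hypothetical pre-flight check, not a velocity-kit function; it takes two iterables of barcode strings, such as the lines of each run's `barcodes.tsv.gz`.

```python
def barcode_overlap(total_barcodes, exonic_barcodes):
    """Summarize how well two barcode lists agree."""
    total = set(total_barcodes)
    exonic = set(exonic_barcodes)
    shared = total & exonic
    return {
        "shared": len(shared),
        "total_only": len(total - exonic),
        "exonic_only": len(exonic - total),
        # Share of the total run's barcodes also present in the exonic run
        "fraction_of_total_shared": len(shared) / len(total) if total else 0.0,
    }
```

A low shared fraction, or many barcodes present in only one run, is a hint that a filtered matrix was supplied where a raw one was expected.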

## Requirements

- Python ≥ 3.8
- anndata ≥ 0.8.0
- h5py ≥ 3.8.0
- loompy ≥ 3.0.6
- numpy ≥ 1.21.0 (< 2.0.0 to avoid breaking changes)
- pandas ≥ 1.3.0
- scipy ≥ 1.7.0
- tqdm ≥ 4.60.0

Optional:

- scvelo ≥ 0.2.4 (for preprocessing)

**Note:** Python 3.7 support was dropped in v0.2.0. For older Python versions, use velocity-kit v0.1.x.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

MIT License - see LICENSE file for details.

## Citation

If you use this tool in your research, please cite:

[Add citation information here]

## Contact

For questions or issues, please email ccrsfifx@nih.gov or open an issue on GitHub.

## Changelog

### v0.1.0 (Initial Release)

- PIPseq/PIPseeker support
- Modular platform architecture
- Python API for custom workflows

