Tools to quantify GIFT-seq datasets.

These details have not been verified by PyPI

Project links

Project description

GIFT-wrap

This package provides tools for dealing with GIFT-seq data. The typical workflow is to use the all-in-one command giftwrap. Ideally, the WTA panel should be generated beforehand to allow for basic analysis and QC. Additionally, this package provides programmatic utilities for dealing with this data.

For detailed documentation and tutorials, please refer to the documentation website.

Installation

Since the package provides a CLI, it is recommended to install it with pipx or uvx if you do not plan to use the python API, so that the CLI is available on your PATH and does not interfere with your current environment. In this case you can replace pip with pipx or uvx in the following commands.

Recommended installation:

pip install giftwrap[all]

This installs the requirements for spatial data processing and basic summary analyses of paired WTA data.

Alternatively, you can install with fewer dependencies as follows:

pip install giftwrap[analysis]  # For additional basic summary analysis
pip install giftwrap[spatial]  # For spatial counts processing
pip install giftwrap  # Minimal installation

Command Line Usage

Most basic command giftwrap --probes path/to/probes.xlsx --project fastqs/fastq_prefix -o output_dir/

This will process the fastq files in fastqs/ with the prefix fastq_prefix and the probes in path/to/probes.xlsx. The outputs will be saved in output_dir/.

Note that the probes file should be either in .tsv, .csv, or .xlsx format. The following columns can be included:

name (required): By convention: "gene_name HGSVc", example: "TP53 c.215G>A".
lhs_probe (required): The left-hand side probe sequence.
rhs_probe (required): The right-hand side probe sequence.
gap_probe_sequence (optional): The expected gapfill sequence.
original_gap_probe_sequence (optional): If targeting an SNV, the wild-type expected gapfill sequence.
gene (optional): The gene name the probe targets. If not specified, we assume the first word in name is the gene name.

Advanced options

The pipeline can also be run with the following arguments:

--cores: The number of cores to use for processing. Note: VisiumHD has a high complexity cell barcoding scheme, making its counts more memory-bottlenecked than CPU bottlenecked.
-wta: The path to either the filtered_feature_bc_matrix.h5 or the sample_filtered_feature_bc_matrix folder from CellRanger for the WTA panel.
-f: If passed, overwrite files instead of exiting if they already exist.
--technology: The technology. I.e. Flex/VisiumHD. Default is Flex.
--tech_def: Custom technology definition path. See below for more details.
-r1 / -r2: Instead of specifying a fastq prefix to search with --project, provide the specific R1 and R2 files.
--multiplex: If the fastq files are multiplexed, this flag should be set with the number of expected samples.
--barcode: Similar to --multiplex, but only the provided barcode is processed (remaining ignored).

Additionally each step can be run individually with the following commands:

giftwrap-count
giftwrap-correct-umis
giftwrap-correct-gapfill
giftwrap-collect
giftwrap-summarize

Technology Definition

To add a custom technology definition, you must create a python file describing its features by subclassing giftwrap.utils.TechnologyFormatInfo. To get a template, you can output the Flex technology difference with:

giftwrap-generate-tech > tech_def.py

You can then modify the python file as needed. To use the custom technology definition you must pass the following arguments to the giftwrap or giftwrap-count commands:

--technology Custom: Specifies a custom technology defintiion
--tech_def path/to/tech_def.py: The path to your custom technology definition file.

Analyzing processed GIFT-seq data

Files for analysis

The final output file of the pipeline that should be used for analysis is the counts.N.h5 file in the output directory, where N is the plex number (1 by default when the data is not multiplexed). Since gapfills are lower fidelity compared to standard WTA panels, it is highly recommended to run cellranger/spaceranger on the WTA panel, and provide its output to giftwrap with the --wta argument. This will allow giftwrap to filter the counts.N.h5 file to only include valid cells based on the WTA panel called by cellranger/spaceranger. If provided, users should make use of the counts.N.filtered.h5 file instead of the unfiltered version.

Additionally, there will be several generated files in the output directory with statistics which may be used for quality control:

fastq_metrics.tsv: Summary statistics of the parsing of the given fastq files
- TOTAL_READS: Total number of reads processed by the count step.
- PROBE_CONTAINING_READS: Total number of reads that contained a valid probe (including umi/cell barcode).
- POSSIBLE_PROBES: The total number of probes defined.
- PROBES_ENCOUNTERED: The number of probes that were encountered in the fastq files.
- EXACT: The number of reads that contained probes and required no error correction.
- CORRECTED_BARCODE: The number of reads that required cell barcode correction.
- CORRECTED_LHS: The number of reads that required left-hand side probe correction.
- CORRECTED_RHS: The number of reads that required right-hand side probe correction.
- FILTERED_NO_CELL_BARCODE: The number of reads that were filtered out due to no valid cell barcode.
- FILTERED_NO_PROBE_BARCODE: The number of reads that were filtered out due to no valid probe barcode. Only applicable for multiplex runs.
- FILTERED_NO_LHS: The number of reads that were filtered out due to no valid left-hand side probe.
- FILTERED_NO_RHS: The number of reads that were filtered out due to no valid right-hand side probe.
- FILTERED_NO_CONSTANT: The number of reads that were filtered out due to no valid constant sequence region. Only applicable for Flex.
counts.N.summary.csv: Summary statistics about the final (filtered if available) output of the pipeline.
- TOTAL_CELLS: The total number of cells in the output.
- GAPFILL_CONTAINING_CELLS: The number of cells that contained at least one gapfill read.
- UMIS_PER_CELL_MEAN: The mean number of UMIs per cell.
- UMIS_PER_CELL_MEDIAN: The median number of UMIs per cell.
- UMIS_PER_CELL_STD: The standard deviation of the number of UMIs per cell.
- UMIS_PER_CELL_MIN: The minimum number of UMIs per cell.
- UMIS_PER_CELL_MIN_EXCLUDING_ZERO: The minimum number of UMIs per cell excluding cells with zero UMIs.
- UMIS_PER_CELL_MAX: The maximum number of UMIs per cell.
- CELLS_PER_GAPFILL_MEAN: The mean number of cells with gapfills per probe.
- CELLS_PER_GAPFILL_MEDIAN: The median number of cells with gapfills per probe.
- CELLS_PER_GAPFILL_STD: The standard deviation of the number of cells with gapfills per probe.
- CELLS_PER_GAPFILL_MIN: The minimum number of cells with gapfills per probe.
- CELLS_PER_GAPFILL_MAX: The maximum number of cells with gapfills per probe.
counts.N.summary.pdf: Basic analysis of the final (filtered if available) output of the pipeline. The report includes various figures with descriptions inline.

Single-Cell Analysis

Single-cell resolution gapfill data are provided in the counts.N.h5/counts.N.filtered.h5 file. This file is related to the .h5 files typically produced by cellranger. The following concepts are important to understand, however.

Counts data are stored in a CellxFeature sparse matrix format.
Features are NOT single genes/probes. But rather, they represent unique combinations of probes and gapfill product. Information about these are included as metadata.
In addition to pure counts, there are also the total_reads matrix representing the number of reads (including PCR duplicates) supporting a given gapfill and a percent_supporting matrix representing the average % of PCR duplicates that support a gapfill call exactly.

These data can be loaded into standard single-cell analysis tools such as scanpy and Seurat. Since this package is written in python, we provide utilities to deal with these files. But we do provide sample R code to read in the data as well.

Python

The giftwrap module may be directly imported into python to read the data into a scanpy/AnnData object:

import giftwrap as gw

gapfill_adata = gw.read_h5_file("counts.N.h5")

Standard preprocessing and analysis can be done with scanpy as usual. But we additionally provide some specialized functions to make dealing with the data GIFT-seq data easier. Below highlights the main features:

import giftwrap as gw
import scanpy as sc

adata = ...  # Load and pre-process your WTA data here
gapfill_adata = gw.read_h5_file("counts.1.filtered.h5")

# Basic filtering of low-quality genotypes
gw.pp.filter_gapfills(gapfill_adata)

# Ensure the WTA and gapfill data only have the same cells (in the same order)
adata, gapfill_adata = gw.tl.intersect_wta(adata, gapfill_adata)

# Call genotypes from the gapfill data
gw.tl.call_genotypes(gapfill_adata)

# Potentially useful plots (reuqires scanpy)
sc.tl.dendrogram(gapfill_adata, "sample", n_pcs=0)
gw.pl.dendrogram(gapfill_adata, "sample")  # Basic linkage plot across all genotypes per sample

gw.pl.matrixplot(gapfill_adata, "TP53 c.215G>A", "celltype")  # Plot a matrixplot of the genotypes of TP53 c.215G>A

# Plot UMAPs of the genotypes
# Note this assumes that the WTA data has been pre-processed and is ready for sc.pl.umap

gw.tl.transfer_genotypes(adata, gapfill_adata)  # We will be plotting this on the WTA UMAP basis, so transfer genotype info
gw.pl.umap(adata, "TP53 c.215G>A")  # Plot the UMAP colored by the genotype of TP53 c.215G>A

# Note: You can plot a UMAP or t-SNE of the gapfill data directly, but it may be much more noisy.

# Collapse all genotypes into a single probe feature, useful if you don't care about the specific genotypes
gw.tl.collapse_gapfills(gapfill_adata)

R

We package an R script that you may use to read in data. To easily retrieve it, you can run the following command:

giftwrap-generate-r > read_giftwrap.R

This R script requires the Matrix and rhdf5 packages to be installed. All the individual components of the h5 file are read with the read() function. If you have Seurat installed, you can directly read the data into a Seurat object with the read_seurat() function.

Spatial Analysis example

Note: This expects the giftwrap-sc[spatial] extras to be installed. Additionally, we recommend installing spatialdata-io and spatialdata-plot as well.

import giftwrap as gw
import spatialdata as sd
import spatialdata_io as sdio
import spatialdata_plot

wta = sdio.visium_hd(...)  # Load the WTA data
gf = gw.read_h5_file("counts.1.filtered.h5")  # Load the gapfill data

# Join the WTA and gapfill data
wta = gw.sp.join_with_wta(wta, gf)

# Plot the data
wta.pl.

Building

First, install uv: https://github.com/astral-sh/uv

Then set up the environment with: uv sync --all-extras

Build the package with: uv build

Publish with uv uv publish --token <your_token>

Project details

These details have not been verified by PyPI

Project links

Release history Release notifications | RSS feed

0.4.2

Apr 2, 2026

0.4.1

Mar 26, 2026

0.4.0.dev0 pre-release

Mar 3, 2026

0.3.0

Jan 10, 2026

This version

0.2.0

Nov 12, 2025

0.2.0.dev0 pre-release

Nov 12, 2025

0.1.3

Jun 26, 2025

0.1.2

Jun 24, 2025

0.1.1

Jun 20, 2025

0.1.0

Jun 20, 2025

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

giftwrap_sc-0.2.0.tar.gz (101.5 MB view details)

Uploaded Nov 12, 2025 Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

The dropdown lists show the available interpreters, ABIs, and platforms. Enable javascript to be able to filter the list of wheel files.

giftwrap_sc-0.2.0-py3-none-any.whl (1.0 MB view details)

Uploaded Nov 12, 2025 Python 3

File details

Details for the file giftwrap_sc-0.2.0.tar.gz.

File metadata

Download URL: giftwrap_sc-0.2.0.tar.gz
Upload date: Nov 12, 2025
Size: 101.5 MB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for giftwrap_sc-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`c57cb03a0d751b6c4c9bfe132e9b347f633364df8a666745848f62a90c5dca83`
MD5	`8ee26f1a9d119b4e29817a559a591952`
BLAKE2b-256	`de2edd99dd9d1760f7540646e561816f094bddcf97eecd326a7d06a33ff79092`

See more details on using hashes here.

File details

Details for the file giftwrap_sc-0.2.0-py3-none-any.whl.

File metadata

Download URL: giftwrap_sc-0.2.0-py3-none-any.whl
Upload date: Nov 12, 2025
Size: 1.0 MB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: uv/0.9.5

File hashes

Hashes for giftwrap_sc-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`aec01004040ac71ef6438e65561800ed48e52067d15ed686cbe88ec410e477f7`
MD5	`cf04b0f1e9d6cf1f94f181e2a2f8d84a`
BLAKE2b-256	`d6d4022dacc083640e43c6216cf9dc4970841161cfdc2e90ef468650668b7dc7`

See more details on using hashes here.

giftwrap-sc 0.2.0

Navigation

Verified details

Owner

Unverified details

Project links

Meta

Classifiers

Project description

GIFT-wrap

Installation

Command Line Usage

Advanced options

Technology Definition

Analyzing processed GIFT-seq data

Files for analysis

Single-Cell Analysis

Python

R

Spatial Analysis example

Building

Project details

Verified details

Owner

Unverified details

Project links

Meta

Classifiers

Release history Release notifications | RSS feed

Download files

Source Distribution

Built Distribution

File details

File metadata

File hashes

File details

File metadata

File hashes