spaceranger wrangling tools for Oxford Nanopore Technologies' data

Percula

Percula is a Python package that provides a shim between spatial single-cell data output from Oxford Nanopore Technologies' sequencing devices and 10X Genomics' Space Ranger.

At the time of writing, Space Ranger does not natively support long-read sequencing data from Nanopore devices. Percula provides a way to convert the output of the MinKNOW device software into a format that can be ingested by Space Ranger, primarily in order to obtain cell and UMI barcodes for long-read sequencing data. This information can then be fed into wf-single-cell for long-read single-cell analysis.

Installation

Percula can be obtained as either a conda or pip package. For conda, it can be installed with:

conda create -n percula -c conda-forge -c bioconda -c nanoporetech percula
conda activate percula
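
For pip, a minimal sketch follows. It assumes the package is installed from PyPI under the name percula and that a Python 3 interpreter is available as python3. The helper name install_percula_pip and the environment directory percula-venv are hypothetical, and any non-Python dependencies that the conda package would provide are not handled here.

```shell
# Sketch only: install Percula from PyPI into a fresh virtual environment.
# "install_percula_pip" and the default directory "percula-venv" are
# illustrative names, not part of Percula itself.
install_percula_pip() {
    venv_dir="${1:-percula-venv}"
    # Create an isolated environment, then install the package into it.
    python3 -m venv "$venv_dir" &&
        "$venv_dir/bin/python" -m pip install percula
}
```

Calling install_percula_pip and then running `. percula-venv/bin/activate` should make the percula command available in the current shell.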

Usage

The primary function of Percula is to convert the output of MinKNOW into a format that can be handled by Space Ranger. Its secondary function, because it takes over from other parts of wf-single-cell, is to dechimerise and trim reads.

Run Percula with the following command:

percula preprocess <output> <inputs> ...

where <output> is the path where the output files will be written, and <inputs> are the input files to be processed. The inputs may either be single BAM files, or directories. If directories are provided, they will be searched recursively for BAM files.

See the Onward Processing section below for information on how to use the output files with Space Ranger and wf-single-cell.

For additional support running Percula, please contact Oxford Nanopore Support. Noting that the request is for the attention of the Customer Workflows team may speed up your support request.

FASTQ Inputs

Although Percula primarily works with BAM files, it can also be used with FASTQ files through the use of fastcat. Fastcat is used to aggregate files whilst preserving metadata information from either the MinKNOW device software, or the dorado basecaller (which write metadata in slightly different ways).

Note: do not use samtools import to aggregate FASTQ files, as metadata may not be preserved correctly when converting to BAM.

To use Percula with FASTQ files, you can run the following command:

fastcat --bam_out --threads 4 --recurse <inputs> ...  \
    | percula preprocess <path_to_output_directory> -

where <inputs> are the input FASTQ files to be processed. Note the - at the end: it indicates that Percula should read from the standard input stream. As with percula preprocess, the <inputs> argument to fastcat can be a single FASTQ file, or a directory containing FASTQ files.

Outputs

Three outputs are generated by percula preprocess:

  • configs.json: A JSON file containing adapter configurations found within reads.
  • SAMPLE_S1_L001.bam: A BAM file containing the reads that have been processed.
  • SAMPLE_S1_L001_R[1,2]_001.fastq.gz: A pair of pseudo paired-end FASTQ files containing the reads that have been processed. The first file contains the forward reads, and the second file contains the reverse reads.

The first two files are required for downstream processing with wf-single-cell, while the paired-end read files should be provided to Space Ranger for demultiplexing.
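
Before moving on, it can be useful to confirm that all of the files listed above are present. The following is a minimal sketch, assuming that R[1,2] stands for the two literal file names ending in _R1_001.fastq.gz and _R2_001.fastq.gz. The function name check_percula_outputs is hypothetical and is not part of Percula.

```shell
# Sketch: check that a Percula output directory contains the expected files.
# Assumption: "R[1,2]" in the file list means one R1 file and one R2 file.
check_percula_outputs() {
    out_dir="$1"
    missing=0
    for name in \
        configs.json \
        SAMPLE_S1_L001.bam \
        SAMPLE_S1_L001_R1_001.fastq.gz \
        SAMPLE_S1_L001_R2_001.fastq.gz
    do
        # Report every missing file rather than stopping at the first one.
        if [ ! -f "$out_dir/$name" ]; then
            echo "missing: $out_dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_percula_outputs <output> && echo "outputs look complete"` prints the message only when all four files exist.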

Onward Processing

Once the data have been processed with Percula, they can be processed with Space Ranger, and subsequently with wf-single-cell.

Space Ranger processing

The short-read FASTQ output files from Percula can be used with Space Ranger in the same way as any other FASTQ files. For example:

spaceranger count \
    --id <SAMPLE_ID> --slide=<SLIDE_ID> --area=<AREA> \
    --create-bam=true \
    --transcriptome=<TRANSCRIPTOME_REFERENCE> \
    --cytaimage=<VISIUM IMAGE> \
    --fastqs=<PERCULA OUTPUT DIRECTORY>

Please note that the --create-bam=true option is required here: it will produce a BAM file containing the sequencing reads, annotated with spatial barcodes and UMI information. This information is required for downstream processing with wf-single-cell.

The required BAM file will be under the Space Ranger output directory as:

<SPACE_RANGER_OUTPUT>/outs/possorted_genome_bam.bam 
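
As a small sketch of how this path might be resolved and checked before running wf-single-cell, the hypothetical helper below (spaceranger_bam is an illustrative name, not a Space Ranger or Percula command) prints the BAM path if the file exists and fails otherwise.

```shell
# Sketch: resolve the Space Ranger BAM path from an output directory.
# "spaceranger_bam" is a hypothetical helper name used for illustration.
spaceranger_bam() {
    bam="$1/outs/possorted_genome_bam.bam"
    if [ -f "$bam" ]; then
        echo "$bam"
    else
        # A missing file may mean --create-bam=true was not set.
        echo "not found: $bam" >&2
        return 1
    fi
}
```

For example, `BAM=$(spaceranger_bam <SPACE_RANGER_OUTPUT>) || exit 1` stops a script early when the BAM file was not produced.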

For further help running Space Ranger, please refer to 10X Genomics' documentation.

wf-single-cell processing

The output from Space Ranger can be combined with the output of Percula to run wf-single-cell:

nextflow run wf-single-cell \
    --bam <PERCULA_OUT>/SAMPLE_S1_L001.bam \
    --spaceranger_bam <SPACE_RANGER_OUTPUT>/outs/possorted_genome_bam.bam \
    --adapter_configs <PERCULA_OUT>/configs.json \
    --kit visium_hd:v1

The --bam argument should point to the BAM file produced by Percula, while the --spaceranger_bam argument should point to the BAM file produced by Space Ranger. The former is the same option that would be used with the workflow in its standard use with other 10X Genomics data. The latter is particular to the processing of Visium HD data: it provides the spatial barcodes and UMI information to the workflow, causing the workflow to skip its usual read preprocessing and demultiplexing steps. The workflow will still perform full-length, isoform-specific processing such as long-read alignment and isoform quantification. The --adapter_configs argument should refer to the JSON file produced by Percula; this contains counts of adapter configurations that wf-single-cell uses when generating its report.
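
To show how the two output directories map onto these arguments, the sketch below prints the command for a given pair of directories without running it. The function name wf_single_cell_cmd is hypothetical, and the sketch assumes that neither path contains spaces.

```shell
# Sketch: print (but do not run) the wf-single-cell command for a pair of
# output directories. "wf_single_cell_cmd" is an illustrative helper name.
# Assumption: the directory paths contain no whitespace.
wf_single_cell_cmd() {
    percula_out="$1"
    spaceranger_out="$2"
    echo "nextflow run wf-single-cell" \
        "--bam $percula_out/SAMPLE_S1_L001.bam" \
        "--spaceranger_bam $spaceranger_out/outs/possorted_genome_bam.bam" \
        "--adapter_configs $percula_out/configs.json" \
        "--kit visium_hd:v1"
}
```

Running `wf_single_cell_cmd <PERCULA_OUT> <SPACE_RANGER_OUTPUT>` prints a single-line command that can be reviewed before it is executed.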

See the wf-single-cell documentation for further information on how to run the workflow, or contact Oxford Nanopore Support.
