ST Pipeline: An automated pipeline for spatial mapping of unique transcripts

Spatial Transcriptomics (ST) Pipeline

The ST Pipeline provides the tools, algorithms and scripts needed to process and analyze raw data in FASTQ format generated with Spatial Transcriptomics or Visium, and to produce datasets for downstream analysis.

The ST Pipeline can also be used to process single cell/nuclei RNA-seq data as long as a file with molecular barcodes identifying each cell is provided (same template as the files in the folder "ids").

The ST Pipeline can also be used to process bulk RNA-seq data; in this case the barcodes file is not required.

The ST Pipeline has been optimized for speed and robustness, and it is easy to use, with many parameters for adjusting its settings. It is fully parallel and has constant memory use. It allows you to skip any of the main steps, provides multiple customization options, and can use either the genome or the transcriptome as the reference.

In its default mode, the ST Pipeline performs the following steps:

1. Quality trimming step (read 1 and read 2):
   - Remove low quality bases
   - Sanity check (reads same length, reads order, etc.)
   - Check the quality of the UMI
   - Remove artifacts (PolyT, PolyA, PolyG, PolyN and PolyC) of user-defined length
   - Check for AT and GC content
   - Discard reads that are shorter than a minimum number of bases or that failed any of the checks above
2. Contaminant filter step (e.g. rRNA genome) (Optional)
3. Mapping with STAR step (only read 2) (Optional)
4. Demultiplexing with Taggd step (only read 1) (Optional)
5. Keep reads (read 2) that contain a valid barcode and are correctly mapped
6. Annotate the reads to the reference (Optional)
7. Group annotated reads by barcode (spot position), gene and genomic location (with an offset) to get a read count
8. In the grouping/counting only unique molecules (UMIs) are kept (Optional)
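
To make the last two steps more concrete, here is a minimal sketch of grouping and UMI-based counting. It is not the pipeline's actual implementation: the `Read` record, the `count_unique_molecules` function, the fixed-size position window and the `offset` value of 250 are all assumptions made for illustration, and UMIs are compared by exact match only.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Set, Tuple


class Read(NamedTuple):
    """Hypothetical minimal record for an annotated, demultiplexed read."""
    spot: Tuple[int, int]  # (x, y) spot position decoded from the barcode
    gene: str              # gene the read was annotated to
    start: int             # mapping start position on the reference
    umi: str               # UMI sequence


def count_unique_molecules(
    reads: Iterable[Read], offset: int = 250
) -> Dict[Tuple[Tuple[int, int], str], int]:
    """Count distinct (position window, UMI) pairs per (spot, gene)."""
    seen: Dict[Tuple[Tuple[int, int], str], Set[Tuple[int, str]]] = defaultdict(set)
    for read in reads:
        # Simplification: reads whose start positions fall in the same
        # fixed-size window are treated as the same genomic location.
        window = read.start // offset
        seen[(read.spot, read.gene)].add((window, read.umi))
    return {key: len(molecules) for key, molecules in seen.items()}


reads = [
    Read((1, 1), "GeneA", 100, "AAAC"),
    Read((1, 1), "GeneA", 120, "AAAC"),  # same window and UMI: a duplicate
    Read((1, 1), "GeneA", 130, "GGTT"),  # same window, different UMI
    Read((1, 1), "GeneA", 900, "AAAC"),  # same UMI, different window
    Read((2, 1), "GeneB", 50, "CCCC"),
]
counts = count_unique_molecules(reads)
print(counts)
```

In this toy input the four GeneA reads at spot (1, 1) collapse to three molecules, because two of them share both the window and the UMI. A real implementation would likely also need to tolerate sequencing errors in the UMIs, which this sketch ignores.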

A more detailed graphical description of the workflow is available in the documents workflow.pdf and workflow_extended.pdf.

The output dataset is a matrix of counts (genes as columns, spots as rows) in TSV format. The ST pipeline will also output a log file with useful stats and information.
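
As an illustration of how such a matrix could be consumed, the sketch below parses a small TSV with spots as rows and genes as columns using only the Python standard library. The example content, the `1x1`-style row labels and the `read_counts` helper are assumptions for illustration and may not match the exact layout of a real output file.

```python
import csv
import io
from typing import Dict, TextIO

# Small in-memory stand-in for an output file (hypothetical content).
# A real file would be opened with open(path, newline="").
EXAMPLE_TSV = "\tGeneA\tGeneB\tGeneC\n1x1\t3\t0\t1\n1x2\t0\t2\t5\n"


def read_counts(handle: TextIO) -> Dict[str, Dict[str, int]]:
    """Return {spot: {gene: count}} from a tab-separated counts matrix."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    genes = header[1:]  # the first header cell belongs to the row-label column
    matrix: Dict[str, Dict[str, int]] = {}
    for row in reader:
        spot, values = row[0], row[1:]
        matrix[spot] = {gene: int(value) for gene, value in zip(genes, values)}
    return matrix


matrix = read_counts(io.StringIO(EXAMPLE_TSV))
# Total count per spot, a common first sanity check on a dataset.
reads_per_spot = {spot: sum(genes.values()) for spot, genes in matrix.items()}
print(reads_per_spot)
```

For large datasets a dataframe library would usually be more convenient; the standard library is used here only to keep the sketch free of dependencies.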

Installation

For users, see install.

For developers, see contributing.

Usage

See usage

Authors

See authors

License

The ST Pipeline is open source under the MIT license, which means that you can use it, change it and redistribute it, but you must always refer to our license (see LICENSE).

Credits

If you use the ST Pipeline, please cite its publication: "ST Pipeline: An automated pipeline for spatial mapping of unique transcripts", Oxford Bioinformatics, doi:10.1093/bioinformatics/btx211

Example dataset

A real dataset, obtained from the public data of the following publication (http://science.sciencemag.org/content/353/6294/78), is available in the folder called "data".

Contact

For questions, bugs, feedback, etc., you can contact:

Jose Fernandez Navarro jc.fernandez.navarro@gmail.com

Release history

Version	Release date
2.0.0 (this version)	Feb 9, 2025
1.8.1	Jan 13, 2021
1.8.0	Jan 13, 2021
1.7.9	May 27, 2020
1.7.8	May 15, 2020
1.7.6	May 10, 2019
1.7.5	May 10, 2019
1.7.3	May 10, 2019
1.7.2	Oct 11, 2018
1.6.5	Sep 12, 2018
1.6.2	Feb 5, 2018
1.6.0	Jan 30, 2018
1.5.1	Sep 13, 2017
1.5.0	Sep 7, 2017
1.4.5	May 12, 2017
1.4.1	May 3, 2017
1.4.0	Apr 21, 2017

Download files

Source Distribution

stpipeline-2.0.0.tar.gz (46.2 kB)

Uploaded Feb 9, 2025 Source

Built Distribution

stpipeline-2.0.0-py3-none-any.whl (56.9 kB)

Uploaded Feb 9, 2025 Python 3

File details

Details for the file stpipeline-2.0.0.tar.gz.

File metadata

Download URL: stpipeline-2.0.0.tar.gz
Upload date: Feb 9, 2025
Size: 46.2 kB
Tags: Source
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.10.12 Darwin/23.6.0

File hashes

Hashes for stpipeline-2.0.0.tar.gz
Algorithm	Hash digest
SHA256	`44efbbc5f4fd97cad0b540e8494e6954103327fc14caaea3b5bff5cb75a8927d`
MD5	`ee804e113bea9037dd3a836bd3ca4997`
BLAKE2b-256	`9270c43274e021746d5fcbd25c9ecaa433745b59eb14b62f2ea3892f2482823a`
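
A downloaded file can be checked against a published digest before installing it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `sha256_of` and `verify` are hypothetical, and the file is assumed to sit in the current directory.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for stpipeline-2.0.0.tar.gz.
EXPECTED_SHA256 = "44efbbc5f4fd97cad0b540e8494e6954103327fc14caaea3b5bff5cb75a8927d"


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: Path, expected: str) -> bool:
    """Return True if the file's SHA256 digest matches the expected one."""
    return sha256_of(path) == expected.lower()


archive = Path("stpipeline-2.0.0.tar.gz")  # assumed download location
if archive.exists():
    print("digest matches" if verify(archive, EXPECTED_SHA256) else "digest MISMATCH")
```

Reading in chunks keeps memory use flat regardless of file size. The same approach applies to the wheel by swapping in its file name and its SHA256 digest.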

File details

Details for the file stpipeline-2.0.0-py3-none-any.whl.

File metadata

Download URL: stpipeline-2.0.0-py3-none-any.whl
Upload date: Feb 9, 2025
Size: 56.9 kB
Tags: Python 3
Uploaded using Trusted Publishing? No
Uploaded via: poetry/2.0.1 CPython/3.10.12 Darwin/23.6.0

File hashes

Hashes for stpipeline-2.0.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`11a0f6952a7e4a88c9a26fa9907c94a87102b5816331e05bb3050270f26fb97e`
MD5	`362b94765073583e8b505b14313be92b`
BLAKE2b-256	`a8e52cf855801a31a8e6d55077536f77f822842f76a7849df102931063acbe1a`
