Post-process *CLIP reads after alignment. A replacement for htseq-clip.

Shoji

Shoji is a flexible command-line toolset for the analysis of iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip, providing streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.

Installation

pip install shoji

Features

  • Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
  • Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
  • Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
  • Counting: Count crosslink sites per window and output results in Apache Parquet format.
  • Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
  • Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
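
To make the sliding-window idea concrete, here is a minimal Python sketch of tiling a single annotated interval with overlapping windows. It illustrates the concept only and is not Shoji's implementation: the function name `sliding_windows`, the `size` and `step` parameters and their defaults, and the choice to truncate the last window at the feature end are all assumptions made for this example. Coordinates are BED-style (0-based, half-open).

```python
from typing import Iterator, Tuple


def sliding_windows(
    start: int, end: int, size: int = 50, step: int = 20
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows tiling the interval [start, end).

    Hypothetical helper: window size, step, and the truncation of the
    final window are illustrative assumptions, not Shoji defaults.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    pos = start
    while pos < end:
        # Clip the window so it never extends past the feature end.
        yield pos, min(pos + size, end)
        if pos + size >= end:
            break  # the feature is fully covered
        pos += step


# A 100 bp feature tiled with 50 bp windows that advance by 20 bp.
windows = list(sliding_windows(100, 200, size=50, step=20))
short_feature = list(sliding_windows(0, 30, size=50, step=20))
```

Because the step is smaller than the window size, neighbouring windows overlap, which is why adjacent windows can end up with identical crosslink counts.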

Major differences to htseq-clip

  • No --splitExons flag: Shoji cannot split exons into components.
  • New --split-intron flag: if an intron overlaps an exon from another gene, this flag splits the intron into non-overlapping chunks.
  • Piping output is disabled. Output file names MUST be specified.
  • Output of the count function is only available in Apache Parquet format.
  • createMatrix does not write duplicate windows by default. If adjacent overlapping windows have the same crosslink counts across all samples, this function now writes only the most 5' window (relative to strand) to the output file.
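
The duplicate-window rule described for createMatrix can be sketched in Python as follows. This is a hedged illustration of the stated behaviour, not Shoji's actual code: the function name `drop_duplicate_windows`, the tuple layout, and the definition of "adjacent" as consecutive windows on the same chromosome and strand after sorting by start are assumptions.

```python
from typing import List, Tuple

# (chrom, start, end, strand) in BED-style half-open coordinates.
Window = Tuple[str, int, int, str]
# One window plus its crosslink counts, one entry per sample.
Row = Tuple[Window, Tuple[int, ...]]


def drop_duplicate_windows(rows: List[Row]) -> List[Row]:
    """Collapse runs of overlapping windows with identical counts.

    Hypothetical helper: from each run, only the most 5' window
    relative to strand is kept.
    """
    ordered = sorted(rows, key=lambda r: (r[0][0], r[0][3], r[0][1]))
    runs: List[List[Row]] = []
    for window, counts in ordered:
        if runs:
            prev_window, prev_counts = runs[-1][-1]
            same_strand = (
                prev_window[0] == window[0] and prev_window[3] == window[3]
            )
            overlaps = window[1] < prev_window[2]
            if same_strand and overlaps and counts == prev_counts:
                runs[-1].append((window, counts))
                continue
        runs.append([(window, counts)])
    kept: List[Row] = []
    for run in runs:
        strand = run[0][0][3]
        # On the minus strand the 5' end is the highest coordinate,
        # so the last window of the run is the most 5' one.
        kept.append(run[-1] if strand == "-" else run[0])
    return kept


rows = [
    (("chr1", 0, 50, "+"), (3, 1)),
    (("chr1", 20, 70, "+"), (3, 1)),   # same counts as the window before
    (("chr1", 40, 90, "+"), (5, 1)),   # counts differ, so it is kept
    (("chr1", 0, 50, "-"), (2, 2)),
    (("chr1", 20, 70, "-"), (2, 2)),   # most 5' on the minus strand
]
deduplicated = drop_duplicate_windows(rows)
```

In this example the plus-strand run keeps the window starting at 0, while the minus-strand run keeps the window ending at 70.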

Developed at Hentze lab, EMBL Heidelberg

Name inspired by: Shoji
