Post-process *CLIP reads after alignment. A replacement for htseq-clip.

Shoji

Shoji is a flexible command-line toolset for the analysis of iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip, providing streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.

Installation

pip install shoji

Features

  • Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
  • Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
  • Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
  • Counting: Count crosslink sites per window and output results in Apache Parquet format.
  • Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
  • Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
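
To make the sliding-window idea concrete, here is a minimal Python sketch of how windows might be tiled over a single annotated feature. It is an illustration only and not Shoji's implementation: the function name `sliding_windows`, the `size` and `step` parameters, the half-open BED-style coordinates, and the truncation of the last window at the feature end are all assumptions made for this example.

```python
from typing import Iterator, Tuple


def sliding_windows(start: int, end: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows tiled over the feature [start, end).

    Hypothetical helper: names and behavior are assumptions, not Shoji's API.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    pos = start
    while pos < end:
        # Truncate the last window so it never runs past the feature end.
        yield pos, min(pos + size, end)
        if pos + size >= end:
            break  # the feature end has been reached
        pos += step


# A 100 bp feature tiled with 50 bp windows moved in 20 bp steps.
for window in sliding_windows(100, 200, size=50, step=20):
    print(window)
```

With a step smaller than the window size, neighboring windows overlap, which is why adjacent windows can end up with identical crosslink counts.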

Major differences to htseq-clip

  • No --splitExons flag: Shoji cannot split exons into components.
  • New --split-intron flag: if an intron overlaps an exon from another gene, this flag splits the intron into non-overlapping chunks.
  • Piping output is disabled: output file names MUST be specified.
  • Output of the count function is only available in Apache Parquet format.
  • createMatrix does not write duplicate windows by default: if adjacent overlapping windows have the same crosslink counts across all samples, it writes only the most 5' window (relative to strand) to the output file.
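
The duplicate-window rule for createMatrix can be sketched in Python as follows. This is a hedged illustration of the behavior described above, not Shoji's actual code: the `Window` record, the `collapse_duplicates` function, and the choice to compare only neighboring windows on the same chromosome and strand are assumptions made for the example.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    """Hypothetical window record (not Shoji's data model)."""
    chrom: str
    start: int               # 0-based, half-open, BED style
    end: int
    strand: str              # "+" or "-"
    counts: Tuple[int, ...]  # one crosslink count per sample


def collapse_duplicates(windows: List[Window]) -> List[Window]:
    """Keep one window per run of overlapping windows with identical counts."""
    ordered = sorted(windows, key=lambda w: (w.chrom, w.strand, w.start, w.end))
    kept: List[Window] = []
    run: List[Window] = []

    def flush() -> None:
        # The most 5' window is the leftmost on "+" and the rightmost on "-".
        if run:
            kept.append(run[0] if run[0].strand == "+" else run[-1])
            run.clear()

    for w in ordered:
        if run:
            prev = run[-1]
            same_run = (
                w.chrom == prev.chrom
                and w.strand == prev.strand
                and w.start < prev.end      # overlaps the previous window
                and w.counts == prev.counts  # same counts in every sample
            )
            if not same_run:
                flush()
        run.append(w)
    flush()
    return kept


windows = [
    Window("chr1", 0, 50, "+", (3, 1)),
    Window("chr1", 20, 70, "+", (3, 1)),   # duplicate of the window above
    Window("chr1", 40, 90, "+", (5, 1)),
    Window("chr1", 0, 50, "-", (2, 2)),    # duplicate of the window below
    Window("chr1", 20, 70, "-", (2, 2)),
]
for w in collapse_duplicates(windows):
    print(w)
```

On the minus strand the retained window is the one with the highest coordinates, since the 5' end of a minus-strand feature lies at its rightmost genomic position.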

Developed at the Hentze lab, EMBL Heidelberg

Name inspired by: Shoji
