Post-process *CLIP reads after alignment. A replacement for htseq-clip.

Shoji

Shoji is a flexible command-line toolset for analyzing iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip and provides streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.

Installation

pip install shoji

Features

  • Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
  • Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
  • Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
  • Counting: Count crosslink sites per window and output results in Apache Parquet format.
  • Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
  • Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
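
To make the sliding-window feature concrete, here is a minimal Python sketch of tiling a single annotated feature with overlapping windows. It is an illustration only, not Shoji's implementation: the function name `sliding_windows`, its parameters, and the choice to truncate the last window at the feature end are all assumptions made for this example.

```python
from typing import Iterator, Tuple


def sliding_windows(start: int, end: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) windows of length `size`, advancing by `step`.

    Hypothetical helper for illustration; the last window is truncated
    at the feature end instead of running past it.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    pos = start
    while pos < end:
        yield pos, min(pos + size, end)
        if pos + size >= end:
            # This window already reaches the feature end; stop here.
            break
        pos += step


# A 100 nt feature tiled with 50 nt windows and a 20 nt step.
print(list(sliding_windows(0, 100, 50, 20)))
# → [(0, 50), (20, 70), (40, 90), (60, 100)]
```

With a step smaller than the window size, neighbouring windows overlap, which is why adjacent windows can end up with identical crosslink counts.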

Major differences to htseq-clip

  • No --splitExons flag: Shoji cannot split exons into components.
  • New --split-intron flag: if an intron overlaps an exon from another gene, this flag splits the intron into non-overlapping chunks.
  • Piping output is disabled. Output file names MUST be specified.
  • Output of the count function is available only in Apache Parquet format.
  • createMatrix does not write duplicate windows by default. If adjacent overlapping windows have the same crosslink counts across all samples, this function now writes only the most 5' window (relative to strand) to the output file.
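
The duplicate-window rule for createMatrix can be sketched in Python as follows. This is a hedged illustration of the behaviour described above, not Shoji's own code: the `Window` record and the `collapse_duplicate_windows` function are hypothetical names, and the sketch assumes half-open coordinates and one tuple of per-sample counts per window.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    # Hypothetical record type for this sketch.
    chrom: str
    start: int
    end: int
    strand: str
    counts: Tuple[int, ...]  # one crosslink count per sample


def _most_five_prime(run: List[Window]) -> Window:
    # On the minus strand the 5' end is at the highest coordinate.
    return run[-1] if run[0].strand == "-" else run[0]


def collapse_duplicate_windows(windows: List[Window]) -> List[Window]:
    """Keep one window per run of overlapping windows with identical counts."""
    ordered = sorted(windows, key=lambda w: (w.chrom, w.strand, w.start, w.end))
    kept: List[Window] = []
    run: List[Window] = []
    for w in ordered:
        prev = run[-1] if run else None
        extends_run = (
            prev is not None
            and w.chrom == prev.chrom
            and w.strand == prev.strand
            and w.counts == prev.counts
            and w.start < prev.end  # overlaps the previous window
        )
        if run and not extends_run:
            kept.append(_most_five_prime(run))
            run = []
        run.append(w)
    if run:
        kept.append(_most_five_prime(run))
    return kept
```

For example, two overlapping plus-strand windows with counts `(1, 2)` collapse to the one with the lower start coordinate, while two such minus-strand windows collapse to the one with the higher coordinates.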

Developed at the Hentze lab, EMBL Heidelberg

Name inspired by: Shoji

Download files

Source Distribution

shoji-0.44.4.tar.gz (39.9 kB view details)

Uploaded Source

Built Distribution

shoji-0.44.4-py3-none-any.whl (47.8 kB view details)

Uploaded Python 3

File details

Details for the file shoji-0.44.4.tar.gz.

File metadata

  • Download URL: shoji-0.44.4.tar.gz
  • Size: 39.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shoji-0.44.4.tar.gz
Algorithm Hash digest
SHA256 2610472121ee3995dad799c48bf26efd6402ba9e04ce5e163d236cec557048b3
MD5 0210d08fa414cc8998834736a600e31b
BLAKE2b-256 b51429e4d6280fe052fdf8fda6496934b9369afdd8f90a519ede0094c75a5ca0
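
A downloaded archive can be checked against the SHA256 digest listed above with Python's standard `hashlib` module. The sketch below is one way to do this, not part of Shoji: the helper names `sha256_of_file` and `matches_published_hash` are hypothetical, and the local file path is whatever location you downloaded the archive to.

```python
import hashlib

# SHA256 digest published above for shoji-0.44.4.tar.gz.
EXPECTED_SHA256 = "2610472121ee3995dad799c48bf26efd6402ba9e04ce5e163d236cec557048b3"


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected hex string, ignoring case."""
    return sha256_of_file(path) == expected.strip().lower()
```

Calling `matches_published_hash("shoji-0.44.4.tar.gz")` on the downloaded file should return `True` if the archive is intact. A `False` result means the file differs from the one whose digest is listed here.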

Provenance

The following attestation bundles were made for shoji-0.44.4.tar.gz:

Publisher: build_publish.yml on EMBL-Hentze-group/Shoji

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file shoji-0.44.4-py3-none-any.whl.

File metadata

  • Download URL: shoji-0.44.4-py3-none-any.whl
  • Size: 47.8 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for shoji-0.44.4-py3-none-any.whl
Algorithm Hash digest
SHA256 cf64286e464e8ef46e759c5a2023d2584fa0e3d445e8db08eaf6fcc78e6d0b9c
MD5 f35ff30cbc0bd7954a31e7b0462e7c1b
BLAKE2b-256 f5ef844b6ddd2e8fe0800f0965715768ae9ee5df0b5fe63250212a1cee6b1414

Provenance

The following attestation bundles were made for shoji-0.44.4-py3-none-any.whl:

Publisher: build_publish.yml on EMBL-Hentze-group/Shoji
