
Post-process *CLIP reads after alignment. A replacement for htseq-clip.

Shoji

Shoji is a flexible command-line toolset for the analysis of iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip, providing streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.

Features

  • Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
  • Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
  • Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
  • Counting: Count crosslink sites per window and output results in Apache Parquet format.
  • Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
  • Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
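
To make the sliding-window and counting steps concrete, here is a minimal Python sketch. It is not Shoji's implementation: the `Window` type, the function names, and the default `size` and `step` values are hypothetical, and it tiles windows in genomic coordinates and only numbers them from the 5' end, which is a simplification.

```python
from collections import Counter
from typing import NamedTuple


class Window(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive
    strand: str
    number: int  # 1-based index, counted from the feature's 5' end


def sliding_windows(chrom, start, end, strand, size=50, step=20):
    """Tile [start, end) with windows of at most `size` nt, advancing by `step` nt."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    spans = []
    pos = start
    while pos < end:
        spans.append((pos, min(pos + size, end)))
        if pos + size >= end:  # this window already reaches the feature end
            break
        pos += step
    if strand == "-":
        # On the minus strand the 5' end is the highest coordinate.
        spans.reverse()
    return [Window(chrom, s, e, strand, i) for i, (s, e) in enumerate(spans, start=1)]


def count_sites(windows, sites):
    """Count crosslink sites, given as (chrom, position, strand), per window.

    A site is counted in every window that contains it, so overlapping
    windows can share sites. This is a naive quadratic scan for clarity.
    """
    counts = Counter()
    for chrom, pos, strand in sites:
        for w in windows:
            if w.chrom == chrom and w.strand == strand and w.start <= pos < w.end:
                counts[w] += 1
    return counts


windows = sliding_windows("chr1", 0, 100, "+")
sites = [("chr1", 45, "+"), ("chr1", 95, "+"), ("chr1", 45, "-")]
counts = count_sites(windows, sites)
for w in windows:
    print(w.chrom, w.start, w.end, w.strand, w.number, counts[w])
```

A real tool would use an interval index rather than a nested loop; the sketch only shows how overlapping windows and per-window counts relate.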

Major differences to htseq-clip

  • No --splitExons flag; Shoji cannot split exons into components.
  • New --split-intron flag. If an intron overlaps an exon from another gene, this flag splits the intron into non-overlapping chunks.
  • Piping output is disabled. Output file names MUST be specified.
  • The count function writes its output only in Apache Parquet format.
  • createMatrix does not write duplicate windows by default. If adjacent overlapping windows have the same crosslink counts across all samples, it writes only the most 5' window (relative to strand) to the output file.
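
The duplicate-window rule for createMatrix can be sketched as follows. This is one reading of the rule above, not Shoji's code: the `WindowCounts` type and `drop_duplicate_windows` function are hypothetical, and the sketch assumes all rows belong to one feature on one strand.

```python
from typing import NamedTuple, Tuple


class WindowCounts(NamedTuple):
    start: int                # 0-based, inclusive
    end: int                  # 0-based, exclusive
    counts: Tuple[int, ...]   # one crosslink count per sample


def overlaps(a, b):
    """True if the two half-open intervals share at least one position."""
    return a.start < b.end and b.start < a.end


def drop_duplicate_windows(rows, strand):
    """Keep only the most 5' window of each run of adjacent, overlapping
    windows whose counts are identical across all samples."""
    if strand == "-":
        # 5' to 3' on the minus strand means descending coordinates.
        ordered = sorted(rows, key=lambda r: -r.end)
    else:
        ordered = sorted(rows, key=lambda r: r.start)
    kept = []
    previous = None
    for row in ordered:
        duplicate = (
            previous is not None
            and overlaps(previous, row)
            and previous.counts == row.counts
        )
        if not duplicate:
            kept.append(row)
        previous = row  # always compare against the immediate neighbour
    return kept


rows = [
    WindowCounts(0, 50, (1, 0)),
    WindowCounts(20, 70, (1, 0)),    # same counts as its 5' neighbour on "+"
    WindowCounts(40, 90, (2, 0)),
    WindowCounts(60, 110, (2, 0)),   # same counts as its 5' neighbour on "+"
    WindowCounts(200, 250, (2, 0)),  # same counts, but no overlap: kept
]
for row in drop_duplicate_windows(rows, "+"):
    print(row.start, row.end, row.counts)
```

Comparing each window with its immediate neighbour, rather than with the last kept window, means a long run of identical overlapping windows collapses to its single most 5' member.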

Developed at the Hentze lab, EMBL Heidelberg

Name inspired by: Shoji

Source Distribution

shoji-0.41.0.tar.gz (39.4 kB)

Built Distribution

shoji-0.41.0-py3-none-any.whl (47.3 kB)

File details

Details for the file shoji-0.41.0.tar.gz.

File metadata

  • Download URL: shoji-0.41.0.tar.gz
  • Upload date:
  • Size: 39.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for shoji-0.41.0.tar.gz
Algorithm Hash digest
SHA256 1522bb2ce032f42b97750c8515a30603006f9d951ead3835fa08689afb2225e8
MD5 b226ca9f77f01a8cd598b15d5c90a6f3
BLAKE2b-256 601f5d2ed9332feeda1a4d25bdbd9c31ae6c7632afa0a549597dda96d045c00f

Provenance

The following attestation bundles were made for shoji-0.41.0.tar.gz:

Publisher: build_publish.yml on EMBL-Hentze-group/Shoji

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file shoji-0.41.0-py3-none-any.whl.

File metadata

  • Download URL: shoji-0.41.0-py3-none-any.whl
  • Upload date:
  • Size: 47.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.12.9

File hashes

Hashes for shoji-0.41.0-py3-none-any.whl
Algorithm Hash digest
SHA256 aa4d7724c1b504b5446f73719a6e232c254e689a9b1d46ac4ee0ab1a9e1365be
MD5 44fee96158dd6b387067814c4f52bf78
BLAKE2b-256 bda1764ccc00272bc05bb235f9b6ec873e7e4d81473e191e0cab870498e709cc
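
The SHA256 digests listed above can be checked against a downloaded file with Python's standard `hashlib` module. The sketch below is illustrative: the helper names `sha256_of` and `matches_published_hash` are hypothetical, and it assumes the file keeps its published name.

```python
import hashlib
from pathlib import Path

# SHA256 digests as published for the 0.41.0 release files.
EXPECTED_SHA256 = {
    "shoji-0.41.0.tar.gz":
        "1522bb2ce032f42b97750c8515a30603006f9d951ead3835fa08689afb2225e8",
    "shoji-0.41.0-py3-none-any.whl":
        "aa4d7724c1b504b5446f73719a6e232c254e689a9b1d46ac4ee0ab1a9e1365be",
}


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path):
    """True only if the file name is known and its digest matches."""
    expected = EXPECTED_SHA256.get(Path(path).name)
    return expected is not None and sha256_of(path) == expected
```

For example, `matches_published_hash("shoji-0.41.0.tar.gz")` returns `True` only when the local file is byte-identical to the published source distribution.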

Provenance

The following attestation bundles were made for shoji-0.41.0-py3-none-any.whl:

Publisher: build_publish.yml on EMBL-Hentze-group/Shoji
