Post-process *CLIP reads after alignment. A replacement for htseq-clip.

Shoji

Shoji is a flexible command-line toolset for the analysis of iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip, providing streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.

Features

  • Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
  • Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
  • Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
  • Counting: Count crosslink sites per window and output results in Apache Parquet format.
  • Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
  • Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
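
To make the sliding window feature concrete, here is a minimal Python sketch of how windows can be tiled over a single annotated feature. It is an illustration only, not Shoji's implementation: the function name `sliding_windows`, the half-open BED-style coordinates, the default size and step, and the choice to truncate the last window at the feature end are all assumptions.

```python
from typing import Iterator, Tuple


def sliding_windows(
    start: int, end: int, size: int = 50, step: int = 20
) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows tiling the feature [start, end).

    Hypothetical helper: 'size' and 'step' defaults are illustrative values,
    and the final window is truncated at the feature end (an assumption).
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    pos = start
    while pos < end:
        yield pos, min(pos + size, end)
        if pos + size >= end:
            # This window already reaches the feature end; stop here.
            break
        pos += step


if __name__ == "__main__":
    for w_start, w_end in sliding_windows(100, 200):
        print(w_start, w_end)
```

With a step smaller than the window size, neighbouring windows overlap, which is why adjacent windows can end up with identical crosslink counts.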

Major differences to htseq-clip

  • No --splitExons flag: Shoji cannot split exons into components.
  • New --split-intron flag: if an intron overlaps an exon from another gene, this flag splits the intron into non-overlapping chunks.
  • Piping output is disabled. Output file names MUST be specified.
  • count output is available only in Apache Parquet format.
  • createMatrix does not write duplicate windows by default. If adjacent overlapping windows have the same crosslink counts across all samples, it writes only the most 5' window (relative to strand) to the output file.
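
The duplicate-window rule described for createMatrix can be sketched in Python as follows. This is a hedged illustration of the behaviour as worded above, not Shoji's code: the `Window` type, the function `drop_duplicate_windows`, and the reading that a run of consecutive overlapping windows with identical counts collapses to its single most 5' member are assumptions.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    """Hypothetical record: one window with its per-sample crosslink counts."""
    chrom: str
    start: int
    end: int
    strand: str
    counts: Tuple[int, ...]


def drop_duplicate_windows(windows: List[Window]) -> List[Window]:
    """Keep only the most 5' window of each run of duplicates.

    Assumes all windows come from one feature (same chrom and strand).
    A window is treated as a duplicate when it overlaps the previous
    window in 5'->3' order and has identical counts in every sample.
    """
    if not windows:
        return []
    # 5'->3' order: ascending start on '+', descending start on '-'.
    minus = windows[0].strand == "-"
    ordered = sorted(windows, key=lambda w: w.start, reverse=minus)

    kept = [ordered[0]]
    prev = ordered[0]
    for win in ordered[1:]:
        overlaps = win.start < prev.end and prev.start < win.end
        if not (overlaps and win.counts == prev.counts):
            kept.append(win)
        # Always advance, so a long run collapses to its first member.
        prev = win
    return kept


if __name__ == "__main__":
    demo = [
        Window("chr1", 0, 50, "+", (1, 2)),
        Window("chr1", 20, 70, "+", (1, 2)),
        Window("chr1", 40, 90, "+", (3, 0)),
    ]
    for win in drop_duplicate_windows(demo):
        print(win)
```

On the minus strand the most 5' window is the one with the highest coordinates, so the sort order is reversed before the same comparison is applied.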

Name inspired by: Shoji

Download files

Source Distribution

shoji-0.40.0.tar.gz (38.6 kB view details)

Uploaded Source

Built Distribution

shoji-0.40.0-py3-none-any.whl (46.4 kB view details)

Uploaded Python 3

File details

Details for the file shoji-0.40.0.tar.gz.

File metadata

  • Download URL: shoji-0.40.0.tar.gz
  • Upload date:
  • Size: 38.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-29-generic

File hashes

Hashes for shoji-0.40.0.tar.gz

Algorithm    Hash digest
SHA256       4cbb3b4472eb06d50d55550b62a8743683c29a7758aad224dd779c5e89d82c9a
MD5          1f10d7bd11eae9cd8d4274e537258096
BLAKE2b-256  5c31d7686c8d9958804a6969cf9fcd7272f763f00ee19a52f117f8dfdc16b32c
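
A downloaded file can be checked against the SHA256 digest listed above before installing. The sketch below uses only Python's standard `hashlib`; the function names `sha256_of` and `matches_published_sha256` and the local file path are hypothetical, chosen for illustration.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published hex digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling `matches_published_sha256` with the path of the downloaded archive and the SHA256 value from the table returns `True` only when the file is byte-for-byte identical to the uploaded one.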

File details

Details for the file shoji-0.40.0-py3-none-any.whl.

File metadata

  • Download URL: shoji-0.40.0-py3-none-any.whl
  • Upload date:
  • Size: 46.4 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-29-generic

File hashes

Hashes for shoji-0.40.0-py3-none-any.whl

Algorithm    Hash digest
SHA256       50ab24774ecc377f79fd363d9723c562d275d3bbfa6530e254c6b58b1c71a34a
MD5          1ef92cf83a0cccb8dfcbb4f228cb4fd0
BLAKE2b-256  2d7fe3b3647559b45ed7ac27fb66366c9c75e145b899be58c2bfc3695261f99e
