Post-process *CLIP reads after alignment. Replacement for htseq-clip.
Shoji
Shoji is a flexible command-line toolset for the analysis of iCLIP and eCLIP sequencing data. It is designed as a replacement for htseq-clip, providing streamlined workflows for annotation parsing, crosslink site extraction, counting, and matrix generation.
Features
- Annotation Parsing: Extract and flatten features from GFF3 files to BED format.
- Sliding Window Generation: Create sliding windows over genomic annotations for downstream analysis.
- Crosslink Extraction: Extract crosslink sites from BAM files with flexible options for site type, mate, and filtering.
- Counting: Count crosslink sites per window and output results in Apache Parquet format.
- Matrix Creation: Aggregate counts across samples into R-friendly matrices (CSV or Parquet).
- Tabix Conversion: Convert BED files to bgzipped, tabix-indexed format for efficient querying.
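To make the sliding-window idea concrete, here is a minimal Python sketch of how windows might be tiled over a single annotated feature. It is not Shoji's implementation: the function name `sliding_windows` and the `size` and `step` parameters are hypothetical, and coordinates are assumed to be BED-style (0-based, half-open).

```python
from typing import Iterator, Tuple


def sliding_windows(start: int, end: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, end) windows tiling the interval [start, end).

    Hypothetical helper for illustration only; the last window is
    truncated at the feature end instead of running past it.
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    pos = start
    while pos < end:
        yield pos, min(pos + size, end)
        if pos + size >= end:
            # This window already reaches the feature end; stop here.
            break
        pos += step


# Example: 50 nt windows with a 20 nt step over a feature at 100-200
windows = list(sliding_windows(100, 200, size=50, step=20))
for w_start, w_end in windows:
    print(f"chr1\t{w_start}\t{w_end}")
```

Features shorter than the window size yield a single truncated window in this sketch; a real tool may handle that case differently.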
Major differences to htseq-clip
- No `--splitExons` flag. Shoji cannot split exons into components.
- New `--split-intron` flag. If an intron overlaps an exon from another gene, using this flag will split the intron into non-overlapping chunks.
- Piping output is disabled. Output file names MUST be specified.
- `count` function output is only available in Apache Parquet format.
- `createMatrix` by default does not write duplicate windows. If adjacent overlapping windows have the same crosslink counts across all samples, this function now writes only the most 5' (relative to strand) window to the output file.
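The duplicate-window rule for `createMatrix` can be illustrated with a short Python sketch. This is an assumption about how such a rule could be implemented, not Shoji's actual code: the `Window` type and `collapse_duplicates` function are hypothetical, and "adjacent" is interpreted here as consecutive windows on the same chromosome and strand after sorting by start position.

```python
from typing import List, NamedTuple, Tuple


class Window(NamedTuple):
    """Hypothetical record: one window with its counts for every sample."""
    chrom: str
    start: int
    end: int
    strand: str
    counts: Tuple[int, ...]


def collapse_duplicates(windows: List[Window]) -> List[Window]:
    """Keep only the most 5' window from each run of overlapping,
    identically counted windows on the same chromosome and strand."""
    ordered = sorted(windows, key=lambda w: (w.chrom, w.strand, w.start, w.end))
    runs: List[List[Window]] = []
    for w in ordered:
        prev = runs[-1][-1] if runs else None
        same_run = (
            prev is not None
            and prev.chrom == w.chrom
            and prev.strand == w.strand
            and w.start < prev.end        # windows overlap
            and w.counts == prev.counts   # identical counts in all samples
        )
        if same_run:
            runs[-1].append(w)
        else:
            runs.append([w])
    # On the minus strand the most 5' window has the highest coordinates.
    return [run[-1] if run[0].strand == "-" else run[0] for run in runs]


example = [
    Window("chr1", 0, 50, "+", (2, 0)),
    Window("chr1", 20, 70, "+", (2, 0)),   # duplicate of the window above
    Window("chr1", 40, 90, "+", (3, 1)),   # overlaps, but counts differ
    Window("chr1", 0, 50, "-", (1, 1)),
    Window("chr1", 20, 70, "-", (1, 1)),   # most 5' on the minus strand
]
kept = collapse_duplicates(example)
```

In this example the plus-strand run keeps the window starting at 0, while the minus-strand run keeps the window ending at 70, since 5' is defined relative to the strand.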
File details
Details for the file shoji-0.31.0.tar.gz.
File metadata
- Download URL: shoji-0.31.0.tar.gz
- Upload date:
- Size: 38.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-29-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 89628872a9756553e80b38e17266c8217c6f15816d36f4cd4838199518af04c3 |
| MD5 | 1d9c30183a934e80ddd92ef186bcca7b |
| BLAKE2b-256 | 409abb410b513501b8bc48debf6571ad8f2a0d5ff0046c0a6803a27166bb11f0 |
File details
Details for the file shoji-0.31.0-py3-none-any.whl.
File metadata
- Download URL: shoji-0.31.0-py3-none-any.whl
- Upload date:
- Size: 46.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.1.3 CPython/3.12.3 Linux/6.11.0-29-generic
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 628c4f15c373ac3c9554a708f01c899698a8d8ab4b1f6d9d274ec099dc55a766 |
| MD5 | 470a71b24ec09de0bd1ed88ce5c66cfa |
| BLAKE2b-256 | 21f3a0b84745f10aee386518ad07be970236db71871054a73d2c59c81a215321 |