csplice

Genomic data processing toolkit for converting GTF to BED format and analyzing BAM gene overlaps.

Installation

# Install from PyPI
pip install csplice

# Or install from source
pip install .

Usage

gtf2bed command

Convert GTF to BED files:

csplice gtf2bed -g input.gtf -o output_dir

This generates four six-column BED files in the output directory. Columns 4 and 5 hold identifiers rather than the name and score fields of standard BED6:

  • gene.bed: chrom, start, end, gene_id, gene_name, strand
  • transcript.bed: chrom, start, end, transcript_id, gene_id, strand
  • exon.bed: chrom, start, end, transcript_id, gene_id, strand
  • intron.bed: chrom, start, end, transcript_id, gene_id, strand
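
For downstream scripts, the gene.bed column order listed above can be read into named records. The following is a minimal sketch, not part of csplice: it assumes the file is tab-separated and has no header line, and the `GeneRecord` type, the `read_gene_bed` helper, and the example line are all hypothetical.

```python
import io
from typing import Iterator, NamedTuple


class GeneRecord(NamedTuple):
    """One gene.bed row, in the column order documented above."""
    chrom: str
    start: int
    end: int
    gene_id: str
    gene_name: str
    strand: str


def read_gene_bed(handle) -> Iterator[GeneRecord]:
    """Yield one GeneRecord per data line (assumes tab-separated columns)."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, start, end, gene_id, gene_name, strand = line.split("\t")[:6]
        yield GeneRecord(chrom, int(start), int(end), gene_id, gene_name, strand)


# Made-up example line, not real csplice output.
example = io.StringIO("chr1\t11868\t14409\tENSG00000223972\tDDX11L1\t+\n")
records = list(read_gene_bed(example))
print(records[0].gene_name, records[0].end - records[0].start)
```

To read a real file, pass an open file object instead of the `io.StringIO` stand-in, for example `open("output_dir/gene.bed")`.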

bam2gene command

Analyze BAM file gene overlaps:

csplice bam2gene -b input.bam -g gene.bed -i intron.bed -o output_dir

This generates two files in the output directory:

  • splice.txt: TSV with columns (barcode, umi, gene_id, gene_name), for reads that do not overlap an intron
  • unsplice.txt: TSV with columns (barcode, umi, gene_id, gene_name), for reads that overlap an intron
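
The two outputs can be summarized per gene by counting distinct (barcode, umi) pairs in each file. This is only a sketch under stated assumptions: it treats both files as tab-separated with no header line, and the `count_umis_per_gene` helper and the sample rows are hypothetical, not part of csplice.

```python
import io
from collections import defaultdict


def count_umis_per_gene(handle):
    """Count distinct (barcode, umi) pairs per gene_id in a four-column TSV."""
    seen = defaultdict(set)
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        barcode, umi, gene_id, _gene_name = fields[:4]
        seen[gene_id].add((barcode, umi))
    return {gene_id: len(pairs) for gene_id, pairs in seen.items()}


# Made-up rows standing in for splice.txt and unsplice.txt.
splice = io.StringIO(
    "AAAC\tUMI1\tG1\tGeneA\n"
    "AAAC\tUMI1\tG1\tGeneA\n"  # duplicate pair, counted once
    "AAAG\tUMI2\tG1\tGeneA\n"
)
unsplice = io.StringIO("AAAC\tUMI3\tG1\tGeneA\n")

spliced = count_umis_per_gene(splice)
unspliced = count_umis_per_gene(unsplice)

# Fraction of each gene's UMIs that came from intron-overlapping reads.
unspliced_fraction = {}
for gene_id in set(spliced) | set(unspliced):
    s = spliced.get(gene_id, 0)
    u = unspliced.get(gene_id, 0)
    unspliced_fraction[gene_id] = u / (s + u)

print(spliced, unspliced, unspliced_fraction)
```

Deduplicating on (barcode, umi) is a design choice of this sketch; if the files are already deduplicated, a plain row count per gene gives the same numbers.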

Options

gtf2bed

Options:
  -g, --gtf TEXT            Input GTF file path  [required]
  -o, --outdir TEXT         Output directory  [required]
  -i, --gene_id TEXT        Gene ID key in attributes  [default: gene_id]
  -n, --gene_name TEXT      Gene name key in attributes  [default: gene_name]
  -t, --transcript_id TEXT  Transcript ID key in attributes  [default: transcript_id]
  --help                    Show this message and exit.
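
The attribute keys named by -i, -n, and -t refer to entries in the ninth column of a GTF line, which holds `key "value";` pairs. The sketch below illustrates that lookup and may help when checking which keys a given annotation uses. It is not csplice's own parser: the `parse_gtf_attributes` helper and the example string are hypothetical, and unquoted values are deliberately ignored.

```python
import re

# Matches `key "value"` pairs; unquoted values (e.g. exon_number 1) are skipped.
ATTRIBUTE_PATTERN = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(column9: str) -> dict:
    """Parse `key "value";` pairs from a GTF attribute column into a dict."""
    return {key: value for key, value in ATTRIBUTE_PATTERN.findall(column9)}


# Made-up attribute string in the usual GTF style.
attrs = parse_gtf_attributes(
    'gene_id "ENSG00000223972"; gene_name "DDX11L1"; '
    'transcript_id "ENST00000456328";'
)

gene_id_key = "gene_id"              # value that would be passed to -i / --gene_id
gene_name_key = "gene_name"          # value that would be passed to -n / --gene_name
transcript_id_key = "transcript_id"  # value that would be passed to -t / --transcript_id

print(attrs[gene_id_key], attrs[gene_name_key], attrs[transcript_id_key])
```

If an annotation stores the gene symbol under a different key, such as `gene` or `Name`, that key is what would be passed to -n.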

bam2gene

Options:
  -b, --bam TEXT         Input BAM file path  [required]
  -o, --outdir TEXT      Output directory  [required]
  -g, --genebed TEXT     Gene BED file path  [required]
  -i, --introned TEXT    Intron BED file path  [required]
  -c, --cb TEXT          Cell barcode tag (default: CB)
  -u, --ub TEXT          UMI tag (default: UB)
  --help                 Show this message and exit.

Development

# Install dev dependencies
poetry install --with dev

# Run tests
poetry run pytest
