csplice

Genomic data processing toolkit for converting GTF to BED format and analyzing BAM gene overlaps.

Installation

# Install from PyPI
pip install csplice

# Or install from source
pip install .

Usage

gtf2bed command

Convert GTF to BED files:

csplice gtf2bed -g input.gtf -o output_dir

This generates four BED files in the output directory, each with six columns. Note that column 5 holds an identifier or name rather than the numeric score used in standard BED6:

  • gene.bed: chrom, start, end, gene_id, gene_name, strand
  • transcript.bed: chrom, start, end, transcript_id, gene_id, strand
  • exon.bed: chrom, start, end, transcript_id, gene_id, strand
  • intron.bed: chrom, start, end, transcript_id, gene_id, strand
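
The six-column layout can be loaded with a few lines of Python. The sketch below is illustrative and not part of csplice: `BedRow` and `read_bed6` are hypothetical names, and it assumes the files are tab-separated, have no header line, and use the usual BED coordinates (0-based start, exclusive end).

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class BedRow(NamedTuple):
    chrom: str
    start: int
    end: int
    col4: str    # gene_id in gene.bed, transcript_id in the other files
    col5: str    # gene_name in gene.bed, gene_id in the other files
    strand: str

    @property
    def length(self) -> int:
        # Valid only under the assumed 0-based, end-exclusive convention.
        return self.end - self.start


def read_bed6(handle: TextIO) -> Iterator[BedRow]:
    """Yield one BedRow per non-empty, non-comment line."""
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue
        chrom, start, end, col4, col5, strand = fields[:6]
        yield BedRow(chrom, int(start), int(end), col4, col5, strand)


# Made-up line standing in for the contents of gene.bed.
example = io.StringIO("chr1\t100\t250\tGENE1\tGeneOne\t+\n")
rows = list(read_bed6(example))
print(rows[0].col4, rows[0].col5, rows[0].length)
```

For a real file, pass `open("output_dir/gene.bed", newline="")` instead of the in-memory example.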

bam2gene command

Analyze BAM file gene overlaps:

csplice bam2gene -b input.bam -g gene.bed -i intron.bed -o output_dir

This generates two files in the output directory:

  • splice.txt: TSV with columns (barcode, umi, gene_id, gene_name) for reads without intron overlap
  • unsplice.txt: TSV with columns (barcode, umi, gene_id, gene_name) for reads with intron overlap
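
A common next step is to summarise the two files per gene. The following is a minimal sketch, not csplice functionality: `count_molecules` is a hypothetical helper, and it assumes both files are tab-separated, carry the four columns in the order listed above, and have no header line. It counts distinct (barcode, umi) pairs per gene and reports the unspliced fraction.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Set, TextIO, Tuple


def count_molecules(handle: TextIO) -> Dict[str, int]:
    """Count distinct (barcode, umi) pairs per gene_id."""
    seen: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        barcode, umi, gene_id, _gene_name = fields[:4]
        seen[gene_id].add((barcode, umi))
    return {gene: len(pairs) for gene, pairs in seen.items()}


# Made-up rows standing in for splice.txt and unsplice.txt.
spliced = io.StringIO(
    "AAAC\tUMI1\tG1\tGeneOne\n"
    "AAAC\tUMI1\tG1\tGeneOne\n"   # same molecule seen twice
    "AAAG\tUMI2\tG1\tGeneOne\n"
)
unspliced = io.StringIO("AAAC\tUMI3\tG1\tGeneOne\n")

spliced_counts = count_molecules(spliced)
unspliced_counts = count_molecules(unspliced)

for gene in sorted(set(spliced_counts) | set(unspliced_counts)):
    s = spliced_counts.get(gene, 0)
    u = unspliced_counts.get(gene, 0)
    print(gene, s, u, u / (s + u))
```

Collapsing on (barcode, umi) is a design choice made here so that duplicate reads from the same molecule are counted once; count rows instead if per-read totals are wanted.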

Options

gtf2bed

Options:
  -g, --gtf TEXT         Input GTF file path  [required]
  -o, --outdir TEXT      Output directory  [required]
  -i, --gene_id TEXT     Gene ID key in attributes  [default: gene_id]
  -n, --gene_name TEXT   Gene name key in attributes  [default: gene_name]
  -t, --transcript_id TEXT  Transcript ID key in attributes  [default: transcript_id]
  --help                 Show this message and exit.
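
The key options exist because annotation sources name their attributes differently (for example `gene` instead of `gene_name`). The sketch below shows how such a lookup could work on one GTF line; it is an assumption about the general approach, not csplice's actual code, and `parse_attributes` and `gene_row` are hypothetical names. It also applies the standard coordinate shift: GTF starts are 1-based and inclusive, while BED starts are 0-based, so the start is reduced by one.

```python
import re
from typing import Dict, Tuple

# Matches GTF attribute pairs such as: gene_id "G1";
_ATTR = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_attributes(column9: str) -> Dict[str, str]:
    """Turn the GTF attribute column into a dict (first value per key wins)."""
    attrs: Dict[str, str] = {}
    for key, value in _ATTR.findall(column9):
        attrs.setdefault(key, value)
    return attrs


def gene_row(
    gtf_line: str, gene_id: str = "gene_id", gene_name: str = "gene_name"
) -> Tuple[str, int, int, str, str, str]:
    """Build a (chrom, start, end, gene_id, gene_name, strand) row."""
    (chrom, _source, _feature, start, end,
     _score, strand, _frame, column9) = gtf_line.rstrip("\n").split("\t")
    attrs = parse_attributes(column9)
    # GTF start is 1-based inclusive; BED start is 0-based. End is unchanged.
    return (chrom, int(start) - 1, int(end), attrs[gene_id], attrs[gene_name], strand)


# Made-up GTF line whose gene name is stored under the key "gene".
line = 'chr1\tsrc\tgene\t101\t250\t.\t+\t.\tgene_id "G1"; gene "GeneOne";'
row = gene_row(line, gene_name="gene")
print(row)
```

With the default `gene_name` key this line would raise a `KeyError`, which is the situation the `-n` option is meant to handle.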

bam2gene

Options:
  -b, --bam TEXT         Input BAM file path  [required]
  -o, --outdir TEXT      Output directory  [required]
  -g, --genebed TEXT     Gene BED file path  [required]
  -i, --introned TEXT    Intron BED file path  [required]
  -c, --cb TEXT          Cell barcode tag (default: CB)
  -u, --ub TEXT          UMI tag (default: UB)
  --help                 Show this message and exit.
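
The `-c` and `-u` options name the alignment tags that carry the cell barcode and UMI. In SAM text form, optional fields follow the 11 mandatory columns and are written as `TAG:TYPE:VALUE`. The sketch below extracts two tags from a SAM-format line to show what those option values refer to; `read_tags` is a hypothetical helper, the record is made up, and reading a real BAM file would need a BAM library rather than plain string splitting.

```python
from typing import Dict, Optional, Tuple


def read_tags(
    sam_line: str, cb: str = "CB", ub: str = "UB"
) -> Tuple[Optional[str], Optional[str]]:
    """Return the values of the barcode and UMI tags, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    tags: Dict[str, str] = {}
    for item in fields[11:]:                 # optional fields start at column 12
        tag, _type, value = item.split(":", 2)
        tags[tag] = value
    return tags.get(cb), tags.get(ub)


# Made-up alignment record with CB, UB and NH tags.
sam = "r1\t0\tchr1\t101\t60\t50M\t*\t0\t0\t*\t*\tCB:Z:AAAC-1\tUB:Z:GGTT\tNH:i:1"
default_tags = read_tags(sam)
other_tags = read_tags(sam, cb="CR")         # a tag this record does not carry
print(default_tags, other_tags)
```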

Development

# Install dev dependencies
poetry install --with dev

# Run tests
poetry run pytest
