CombosStat: A utility for computing shared-data statistics (intersection or union counts) over combinations of data samples

Installation

python3 -m pip install combos_stat

Usage

combos_stat -h

1. stat

Usage: python -m combos_stat.bin.main stat [OPTIONS]

  Calculate statistics for shared data counts based on combinations of samples.

Options:
  -i, --input-file PATH           Path to the input data file  [required]
  -o, --output-file PATH          Path where the results will be saved  [default: output_stat.txt]
  -n, --num-samples INTEGER       Number of samples to compute  [required]
  -t, --share_type [intersection|union]
                                  Type of share to compute
  --header INTEGER                Row number to use as the column names  [default: 0]
  --sep TEXT                      Delimiter to use for reading the input file (e.g., "\t" for tab)
  --start-col INTEGER             Column index to start reading sample data from  [default: 1]
  --show-progress BOOLEAN         Show progress
  --chunksize INTEGER             The chunksize lines to read
  --chunk INTEGER                 The index of chunk
  -h, -?, --help                  Show this message and exit.
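
As a rough illustration of what `stat` computes, the sketch below counts the items shared by every combination of `n` samples. It is not the package's implementation: the `presence` mapping (sample name to set of item IDs) and the `shared_counts` helper are assumptions made for this example, since the input file layout is not described here.

```python
from itertools import combinations


def shared_counts(presence, n, share_type="intersection"):
    """Yield (combo, count) for every combination of n samples (n >= 1).

    presence: hypothetical mapping of sample name -> set of item IDs.
    """
    if share_type not in ("intersection", "union"):
        raise ValueError(f"unknown share type: {share_type}")
    combine = set.intersection if share_type == "intersection" else set.union
    for combo in combinations(sorted(presence), n):
        sets = [presence[name] for name in combo]
        yield combo, len(combine(*sets))


# Toy data, invented for illustration only.
presence = {"A": {1, 2, 3}, "B": {2, 3, 4}, "C": {3, 4, 5}}
for combo, count in shared_counts(presence, 2, "intersection"):
    print(combo, count)
```

The number of combinations grows quickly with the number of samples, which is presumably why the tool offers chunked and batched runs.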


examples:
    combos_stat stat -h
    combos_stat stat -i input.txt -o output.txt -n 13 -t intersection
    combos_stat stat -i input.txt -o output.txt -n 13 -t intersection --chunksize 100 --chunk 2 [reads lines 101-200]
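
The last example pairs `--chunksize 100` with `--chunk 2` and notes lines 101-200. Assuming chunk indices are 1-based, as that example suggests, the mapping from chunk index to line range can be sketched as follows. `chunk_bounds` and `take_chunk` are hypothetical helpers written for this illustration, not functions provided by combos_stat.

```python
from itertools import islice


def chunk_bounds(chunksize, chunk):
    """Return the 1-based (first, last) positions covered by a chunk.

    Assumes chunk indices start at 1, matching the example above.
    """
    if chunksize < 1 or chunk < 1:
        raise ValueError("chunksize and chunk must be positive")
    first = (chunk - 1) * chunksize + 1
    last = chunk * chunksize
    return first, last


def take_chunk(items, chunksize, chunk):
    """Return only the items that fall inside the requested chunk."""
    first, last = chunk_bounds(chunksize, chunk)
    # islice uses 0-based, end-exclusive positions.
    return list(islice(items, first - 1, last))


print(chunk_bounds(100, 2))
```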

2. plot

Usage: python -m combos_stat.bin.main plot [OPTIONS] RESULT_DIR

  Generate Boxplot with statistics results

Options:
  -R, --Rscript TEXT  Path to the executable Rscript  [default: Rscript]
  -w, --write TEXT    Write the R code to a file
  --option TEXT       Options in the format key=value for boxplot, eg. title="Demo Stats", x_lab="Shared_Numbers",
                      y_lab="Data"
  -h, -?, --help      Show this message and exit.



examples:
    combos_stat plot -h
    combos_stat plot out/result
    combos_stat plot out/result --write boxplot.R
    combos_stat plot out/result --write boxplot.R --option x_lab=XXX --option width=30 --option dpi=500

default options:
    infile = 'processed_stats.tsv'
    output = 'boxplot'
    x_lab = 'Genomes'
    y_lab = 'Families'
    title = 'BoxPlot'
    legend_title = 'Type'
    dpi = 300
    width = 14
    height = 7
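
To show how repeated `--option key=value` arguments could override the defaults listed above, here is a minimal sketch. How combos_stat parses these options internally is not documented here, so `merge_options` is a hypothetical helper and the integer coercion is an assumption of this example.

```python
# Defaults copied from the list above.
DEFAULTS = {
    "infile": "processed_stats.tsv",
    "output": "boxplot",
    "x_lab": "Genomes",
    "y_lab": "Families",
    "title": "BoxPlot",
    "legend_title": "Type",
    "dpi": 300,
    "width": 14,
    "height": 7,
}


def merge_options(pairs, defaults=DEFAULTS):
    """Overlay key=value strings on a copy of the defaults."""
    options = dict(defaults)
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got: {pair!r}")
        key = key.strip()
        value = value.strip().strip("\"'")  # drop surrounding quotes
        # Assumption: keys with integer defaults take integer values.
        if isinstance(defaults.get(key), int):
            value = int(value)
        options[key] = value
    return options


print(merge_options(["x_lab=XXX", "width=30", "dpi=500"]))
```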

3. batch

Usage: python -m combos_stat.bin.main batch [OPTIONS]

  Generate batch shells and SJM job

Options:
  -i, --input-file PATH    Path to the input data file
  -sep, --sep TEXT         Delimiter to use for reading the input file (e.g., "\t" for tab)
  -s, --start-col INTEGER  Column index to start reading sample data from  [default: 1]
  -t, --threshold INTEGER  The threshold to divide the combinations  [default: 200000]
  -O, --output-dir PATH    Path to the output directory  [default: .]
  --job TEXT               Generate SJM Job
  --no-check               Do not check queues for SJM
  -h, -?, --help           Show this message and exit.


examples:
    combos_stat batch -h
    combos_stat batch -i input.txt -t 200000 -O out
    combos_stat batch -i input.txt -t 200000 --job run.job
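
The `--threshold` option divides the combinations into separate jobs. The exact splitting rule is not documented here; the sketch below assumes plain ceiling division of the combination count by the threshold. `chunks_needed` is a hypothetical helper, and the sample count of 29 is chosen only for illustration.

```python
from math import comb


def chunks_needed(total_samples, n, threshold=200000):
    """Estimate how many jobs are needed for all n-sample combinations.

    Assumption: each job handles at most `threshold` combinations.
    """
    if threshold < 1:
        raise ValueError("threshold must be positive")
    total = comb(total_samples, n)
    # Ceiling division on integers, with at least one job.
    return max(1, -(-total // threshold))


for n in (2, 14, 29):
    print(n, comb(29, n), chunks_needed(29, n))
```

Combination counts peak near half the number of samples, so mid-sized values of `n` would need the most jobs under this assumption.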

Result

prefix (used in the directory and file names below)

  • x: intersection
  • y: union

shell directory

shell/
├── plot.sh
├── x2
│   └── stat.x2_1.sh
├── y2
│   └── stat.y2_1.sh
...
├── x14
│   ├── stat.x14_1.sh
│   ├── stat.x14_2.sh
│   ├── ...
│   └── stat.x14_100.sh
...
└── y29
    └── stat.y29_1.sh

result directory

result/
├── x2
│   └── x2_1.txt
├── y2
│   └── y2_1.txt
...
├── x14
│   ├── x14_1.txt
│   ├── x14_2.txt
│   ├── ...
│   └── x14_100.txt
...
└── y29
    └── y29_1.txt
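
The file names in the trees above appear to follow a `<prefix><number>_<index>.txt` pattern. Reading the number as the count of samples per combination and the index as the chunk number is an inference from the layout, not something stated by the tool. Under that assumption, a hypothetical `parse_result_name` helper could decode a result path like this:

```python
import re
from pathlib import PurePosixPath

# Prefix meanings as listed under "prefix" above.
PREFIXES = {"x": "intersection", "y": "union"}
RESULT_NAME = re.compile(r"^([xy])(\d+)_(\d+)\.txt$")


def parse_result_name(path):
    """Split a result file name into (share type, number, index).

    Assumption: the number is the samples-per-combination count and
    the index is the chunk number.
    """
    name = PurePosixPath(path).name
    match = RESULT_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized result file name: {name!r}")
    prefix, number, index = match.groups()
    return PREFIXES[prefix], int(number), int(index)


print(parse_result_name("result/x14/x14_2.txt"))
```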
