Process mutation data into standard formats originally developed for the ExploSig family of tools
ExploSig Data
Helpers for processing mutation data into standard formats originally developed for the ExploSig family of tools.
Installation
pip install explosig-data
Example
With a raw SSM/MAF file from ICGC or TCGA:
>>> import explosig_data as ed
>>> # Step 1: Process into the ExploSig "standard format":
>>> data_container = ed.standardize_ICGC_ssm_file('path/to/ssm.tsv') # if ICGC
>>> data_container = ed.standardize_TCGA_maf_file('path/to/maf.tsv') # if TCGA
>>> # Step 2: Extend the dataframe, then compute SBS_96 category counts:
>>> data_container.extend_df().to_counts_df('SBS_96', ed.categories.SBS_96_category_list())
>>> # Step 3: Access any processed dataframe of interest:
>>> ssm_df = data_container.ssm_df
>>> extended_df = data_container.extended_df
>>> counts_df = data_container.counts_dfs['SBS_96']
>>> # Alternatively, use without the chaining API:
>>> ssm_df = ed.standardize_ICGC_ssm_file('path/to/ssm.tsv', wrap=False) # if ICGC
>>> ssm_df = ed.standardize_TCGA_maf_file('path/to/maf.tsv', wrap=False) # if TCGA
>>> extended_df = ed.extend_ssm_df(ssm_df)
>>> counts_df = ed.counts_from_extended_ssm_df(
...     extended_df,
...     category_colname='SBS_96',
...     category_values=ed.categories.SBS_96_category_list()
... )
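To make the idea of a per-category counts dataframe concrete, here is a minimal pandas-only sketch of the kind of tabulation that a counts step performs. It does not call explosig_data and is not the library's implementation; the column names (`sample`, `category`), the category labels, and the helper `count_categories` are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Toy data: one row per mutation. The column names and category labels
# are assumptions for illustration, not the ExploSig standard format.
toy_df = pd.DataFrame({
    "sample": ["S1", "S1", "S1", "S2"],
    "category": ["A[C>A]A", "A[C>A]A", "T[T>G]T", "A[C>T]G"],
})

# The full list of categories to report, including one that never occurs.
category_values = ["A[C>A]A", "A[C>T]G", "G[C>G]G", "T[T>G]T"]


def count_categories(df, sample_col, category_col, category_values):
    """Return a samples-by-categories table of mutation counts.

    Hypothetical helper: categories absent from the data are kept as
    zero-filled columns so every sample has the same set of columns.
    """
    counts = pd.crosstab(df[sample_col], df[category_col])
    return counts.reindex(columns=category_values, fill_value=0)


toy_counts = count_categories(toy_df, "sample", "category", category_values)
print(toy_counts)
```

Passing an explicit category list, as the `category_values` argument above does, keeps the column order fixed and makes unobserved categories visible as zeros.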
With data already in the ExploSig "standard format":
>>> import explosig_data as ed
>>> import pandas as pd
>>> # Step 0: Load the data into a dataframe, for example by reading from a TSV file.
>>> ssm_df = pd.read_csv('path/to/standard.tsv', sep='\t')
>>> # Step 1: Wrap the dataframe using the container class to allow use of the chainable functions.
>>> data_container = ed.SimpleSomaticMutationContainer(ssm_df)
>>> # Now see step 2 above (or the alternative steps above).
Development
Install for development (in editable mode):
pip install -e .
Build and push to PyPI:
python setup.py sdist bdist_wheel
python -m twine upload dist/*
File details
Details for the file explosig-data-0.0.5.tar.gz.
File metadata
- Download URL: explosig-data-0.0.5.tar.gz
- Upload date:
- Size: 15.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | ea642e35db841f48011aaa54130a0037bb254117d5d16e60da4457cc2a077298 |
| MD5 | 83e5b59b2d7e3df7a75b2664d86e4949 |
| BLAKE2b-256 | 26cf1da9134ec562eff369bd11c20ae922de6bf9150370892fad727957fee717 |
File details
Details for the file explosig_data-0.0.5-py3-none-any.whl.
File metadata
- Download URL: explosig_data-0.0.5-py3-none-any.whl
- Upload date:
- Size: 21.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.39.0 CPython/3.6.9
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | b205fad3c12853f383a81c2cf2f26b93cf5d7dc92087dfe3f178b43e517f9098 |
| MD5 | f71c09dd7d993649fbe4ed37a3ca80a3 |
| BLAKE2b-256 | a332d87e29265eba0529bc2304e73068f55a70cf29cfb720304da535dd121b7c |
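The SHA256 digests listed for each file can be checked against a downloaded copy. The following is a minimal sketch using only the Python standard library; the helper name `sha256_of_file` and the assumption that the archive sits in the current working directory are illustrative, not part of this package.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA256 digest of the file at `path`.

    Hypothetical helper: reads the file in chunks so that large
    archives do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for explosig-data-0.0.5.tar.gz in the table above.
EXPECTED_SDIST_SHA256 = (
    "ea642e35db841f48011aaa54130a0037bb254117d5d16e60da4457cc2a077298"
)

# Example usage, assuming the archive was downloaded to the working directory:
# assert sha256_of_file("explosig-data-0.0.5.tar.gz") == EXPECTED_SDIST_SHA256
```

A mismatch would suggest that the download is incomplete or is not the file that was uploaded.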