A tool for managing large datasets
# expressionable Python Module
The official repository for the expressionable Python module, which allows for:
* Transforming tabular data sets from one format to another.
* Querying large data sets to filter out useful data.
* Selecting additional columns/features to include in the resulting data set.
* Merging data sets of various formats into a single file.
* Gzipping resulting data sets, as well as the ability to read gzipped files.
See the [ExpressionAble command-line tool](https://github.com/srp33/ShapeShifter-CLI), which combines
the features of ExpressionAble with the ease and speed of the command line.
Basic use is described below, but see the full documentation on [Read the Docs](https://shapeshifter.readthedocs.io/en/latest/).
## Install
`pip install expressionable`
## Basic Use
After installing, import the ExpressionAble class with `from expressionable import ExpressionAble`. An ExpressionAble object
represents the file to be transformed. The transformation itself is performed by the `export_filter_results` method. Here is a simple
example in which a file called `input_file.tsv` is transformed into an HDF5 file called `output_file.h5`, while filtering
the data on sex and age:
```python
from expressionable import ExpressionAble
my_expressionable = ExpressionAble("input_file.tsv")
my_expressionable.export_filter_results("output_file.h5", filters="Sex == 'M' and Age > 40")
```
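To make the filter expression concrete, the following standard-library sketch shows which rows a condition like `Sex == 'M' and Age > 40` would keep. It does not use ExpressionAble and is not how the library is implemented; the sample data, the `Sample` column, and the `filter_rows` helper are all hypothetical.

```python
import csv
import io

# Hypothetical sample data standing in for the contents of input_file.tsv.
SAMPLE_TSV = (
    "Sample\tSex\tAge\n"
    "S1\tM\t52\n"
    "S2\tF\t61\n"
    "S3\tM\t35\n"
    "S4\tM\t47\n"
)


def filter_rows(tsv_text):
    """Keep rows where Sex == 'M' and Age > 40 (illustrative helper only)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if row["Sex"] == "M" and int(row["Age"]) > 40]


kept = filter_rows(SAMPLE_TSV)
print([row["Sample"] for row in kept])  # prints ['S1', 'S4']
```

Only the male samples older than 40 survive the filter; every other row is left out of the result.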
Note that the `export_filter_results` example does not state the input or output file type explicitly; ExpressionAble infers both from
the file extensions provided. If necessary, `input_file_type` and `output_file_type` can be named explicitly.
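As a rough illustration of extension-based inference, the sketch below maps a file name to a format and notes whether it is gzipped. It is only a sketch under stated assumptions: the `EXTENSION_TO_FORMAT` table and the `infer_format` function are hypothetical and do not reflect ExpressionAble's actual internals or its full list of recognized extensions.

```python
from pathlib import Path

# Hypothetical extension-to-format table; the real library's mapping may differ.
EXTENSION_TO_FORMAT = {
    ".csv": "CSV",
    ".tsv": "TSV",
    ".json": "JSON",
    ".h5": "HDF5",
}


def infer_format(file_name):
    """Return (format_name, is_gzipped) based only on the file name."""
    suffixes = [suffix.lower() for suffix in Path(file_name).suffixes]
    # A trailing ".gz" marks compression; the format is given by the suffix before it.
    gzipped = bool(suffixes) and suffixes[-1] == ".gz"
    if gzipped:
        suffixes = suffixes[:-1]
    if not suffixes or suffixes[-1] not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Cannot infer a format for {file_name!r}; name it explicitly.")
    return EXTENSION_TO_FORMAT[suffixes[-1]], gzipped


print(infer_format("input_file.tsv"))
print(infer_format("output_file.h5"))
print(infer_format("input_file.tsv.gz"))
```

When a name carries no recognizable extension, the sketch raises an error, which is the situation where naming the file type explicitly becomes necessary.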
## Contributing
We welcome contributions that help expand ExpressionAble to be compatible with additional file formats. If you are
interested in contributing, please follow the instructions [here](https://github.com/srp33/ExpressionAble/wiki/Adding-Support-for-Additional-File-Types-in-ExpressionAble).
## Currently Supported Formats
#### Input Formats:
* CSV
* TSV
* JSON
* Excel
* HDF5
* Parquet
* MsgPack
* Stata
* Pickle
* SQLite
* ARFF
* GCT
* GCTX
* PDF
* Kallisto
* GEO
* StarReads
#### Output Formats:
* CSV
* TSV
* JSON
* Excel
* HDF5
* Parquet
* MsgPack
* Stata
* Pickle
* SQLite
* ARFF
* GCT
* RMarkdown
* JupyterNotebook
## Future Formats to Support
We are working hard to expand ExpressionAble to work with even more file formats! Expect the following formats to be
included in future releases:
* Fixed-width files (fwf)
* Genomic Data Commons clinical XML