Parse and read RDS files as Python representations
rds2py
Parse and construct Python representations for datasets stored in RDS files. It supports a few base classes from R and Bioconductor's SummarizedExperiment and SingleCellExperiment S4 classes. This is possible because of Aaron's rds2cpp library.
The package uses memory views (except for strings) to access the same memory from C++ in Python (through Cython, of course). This is especially useful for large datasets because it avoids making multiple copies of the data.
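To illustrate the zero-copy idea in isolation, here is a minimal sketch that uses only the Python standard library. It does not call rds2py; the array below is an assumption that simply stands in for memory owned by the C++ side.

```python
from array import array

# A buffer of doubles standing in for memory allocated outside Python
# (an illustrative stand-in, not how rds2py allocates its data).
buffer = array("d", [1.0, 2.0, 3.0, 4.0])

# A memoryview exposes the same memory without copying it.
view = memoryview(buffer)

# Slicing a memoryview yields another view over the same memory.
tail = view[2:]

# Writing through the view changes the underlying buffer.
tail[0] = 30.0

# A list, by contrast, is an independent copy of the values.
copied = list(buffer)
copied[0] = -1.0

print(buffer.tolist())  # the write made through the view is visible here
print(copied)           # the change to the copy did not touch the buffer
```

The view costs a fixed amount of memory regardless of the buffer size, which is why this approach matters for large datasets.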
Install
The package is published to PyPI:
pip install rds2py
Usage
If you do not have an RDS object handy, feel free to download one from single-cell-test-files.
from rds2py import as_SCE, read_rds
# replace the placeholder with the path to your RDS file
rObj = read_rds("<path_to_file>")
Once we have a realized structure of the RDS file, we can build useful Python representations.
This rObj contains the realized structure of the RDS file as a Python dict object. It contains two keys:
- data: for atomic entities, contains the numpy view of the memory space.
- attributes: additional properties available for the object.
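As a rough sketch of how such a dict could be inspected, the helper below walks the data and attributes keys described above. The describe function and the mock object are hypothetical; in particular, the assumption that attributes are themselves nested dicts with the same two keys is for illustration only, and the layout returned by read_rds may differ.

```python
def describe(robj, name="root", depth=0):
    """Summarize a dict that follows the data/attributes layout.

    Hypothetical helper for illustration; not part of rds2py.
    """
    pad = "  " * depth
    data = robj.get("data")
    size = len(data) if hasattr(data, "__len__") else "n/a"
    lines = [f"{pad}{name}: data={type(data).__name__}, length={size}"]
    for key, value in (robj.get("attributes") or {}).items():
        if isinstance(value, dict):
            # Assumed: an attribute may itself carry data and attributes.
            lines.extend(describe(value, key, depth + 1))
        else:
            lines.append(f"{pad}  {key}: {value!r}")
    return lines


# A mock object standing in for the result of read_rds (layout assumed).
mock = {
    "data": [1.5, 2.5, 3.5],
    "attributes": {
        "names": {"data": ["a", "b", "c"], "attributes": {}},
    },
}

summary = describe(mock)
print("\n".join(summary))
```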
The package provides friendly functions to easily convert a few R representations to Python representations.
from rds2py import as_sparse_matrix, as_SCE
# to convert an robject to a sparse matrix
sp_mat = as_sparse_matrix(rObj)
# to convert an robject to SCE
sce = as_SCE(rObj)
For more use cases converting data.frame, dgCMatrix, and dgRMatrix to Python, check out the documentation.
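For readers unfamiliar with dgCMatrix, the sketch below shows what a compressed sparse column layout encodes. In R, a dgCMatrix stores row indices (i), column pointers (p), non-zero values (x) and dimensions (Dim). The csc_to_dense helper is hypothetical and is not how rds2py performs the conversion; it only expands the three arrays into a dense list of rows.

```python
def csc_to_dense(i, p, x, dim):
    """Expand compressed sparse column arrays into a dense list of rows.

    Hypothetical helper for illustration; not part of rds2py.
    """
    nrow, ncol = dim
    dense = [[0.0] * ncol for _ in range(nrow)]
    for col in range(ncol):
        # Entries p[col] .. p[col + 1] - 1 belong to this column.
        for k in range(p[col], p[col + 1]):
            dense[i[k]][col] = x[k]
    return dense


# A 3 x 2 matrix: column 0 has values at rows 0 and 2, column 1 at row 1.
i = [0, 2, 1]        # row index of each non-zero value
p = [0, 2, 3]        # start offset of each column, plus a final end offset
x = [1.0, 2.0, 3.0]  # the non-zero values
dense = csc_to_dense(i, p, x, (3, 2))

for row in dense:
    print(row)
```

In practice a sparse container is preferable to a dense expansion; for example, scipy.sparse.csc_matrix accepts the same three arrays as (x, i, p) together with a shape.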
If you want to add more representations, feel free to send a PR on this repository!
Developer Notes
This project uses Cython to provide bindings from C++ to Python. It tries to use the same memory space (except for strings) instead of making a copy of the data.
Steps to set up dependencies:
- Make sure the git submodule is initialized in extern/rds2cpp.
- Run cmake . in the extern/rds2cpp directory to download dependencies, especially the byteme library.
First, build the extern library. This generates a shared object file at src/rds2py/core-[*].so:
python setup.py build_ext --inplace
For typical development workflows, run
python setup.py build_ext --inplace && tox
Note
This project has been set up using PyScaffold 4.3. For details and usage information on PyScaffold see https://pyscaffold.org/.
Source Distribution
File details
Details for the file rds2py-0.2.0.tar.gz.
File metadata
- Download URL: rds2py-0.2.0.tar.gz
- Upload date:
- Size: 1.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.16
File hashes
Algorithm | Hash digest
---|---
SHA256 | aa9424dd7124d76d229cb13d3cfcba084687bf9675f6e731216d1d21c4167fff
MD5 | 4a538575e8355046c3d76d504922ff7d
BLAKE2b-256 | 9693d56453d02da9ac8026f84a9d2e7b90c578a64452cd9b9279a20262a94911