Ersilia utilities for working with tabular output data
Manipulating Ersilia's dataframes
eosframes is a library for manipulating inputs and outputs from the Ersilia Model Hub. It splits, assembles, converts, scales, and summarises tabular model output files.
Installation
Python ≥ 3.8 is required.
pip install eosframes
Or from source:
git clone https://github.com/ersilia-os/eosframes.git
cd eosframes
pip install -e .
Quick start
Every file the library reads or writes encodes a model ID and version in its filename, e.g. eos4e40_v1.csv (model eos4e40, version v1).
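As an illustration of the filename convention, here is a minimal sketch of pulling the model ID and version out of a name. It is not the library's own parser. The regular expression, the optional numeric chunk suffix (seen in names such as eos4e40_v1_000.csv), and the helper name `parse_output_name` are assumptions made for this example; the authoritative rules are in docs/nomenclature.md.

```python
import re
from typing import NamedTuple, Optional

# ASSUMPTION: names look like "<model_id>_<version>[_<chunk>].<ext>", with a
# model ID of "eos" plus four alphanumeric characters. The real contract is
# described in docs/nomenclature.md and may differ.
_NAME_RE = re.compile(
    r"^(?P<model_id>eos[0-9a-z]{4})"
    r"_(?P<version>v\d+)"
    r"(?:_(?P<chunk>\d+))?"
    r"\.(?P<ext>csv|h5)$"
)


class ParsedName(NamedTuple):
    model_id: str
    version: str
    chunk: Optional[int]  # None when the name carries no chunk suffix
    ext: str


def parse_output_name(filename: str) -> ParsedName:
    """Hypothetical helper: split a filename into its assumed components."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised filename: {filename!r}")
    chunk = match.group("chunk")
    return ParsedName(
        model_id=match.group("model_id"),
        version=match.group("version"),
        chunk=int(chunk) if chunk is not None else None,
        ext=match.group("ext"),
    )


print(parse_output_name("eos4e40_v1.csv"))
print(parse_output_name("eos4e40_v1_001.csv"))
```

Rejecting names that do not match, rather than guessing, mirrors the idea of a strict contract, but how eosframes itself treats such names is not shown here.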
# Slice a big input CSV into chunks for parallel model runs
eosframes split compounds.csv -o chunks/ --chunksize 10000
# Stitch the per-batch outputs back into one file
eosframes append eos4e40_v1_000.csv eos4e40_v1_001.csv -o eos4e40_v1.csv
# Combine outputs from multiple models, side by side
eosframes stack eos4e40_v1.csv eos7m30_v1.csv -o project_eosmix.csv
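To make the split and append round trip concrete, the following standard-library sketch chunks a table and reassembles it. It only illustrates the idea and does not reproduce what the CLI does internally. The function names, the `key` and `input` column names, and the zero-padded chunk naming are assumptions for this example.

```python
from typing import Iterable, Iterator, List, Sequence

Row = List[str]


def split_rows(header: Row, rows: Sequence[Row], chunksize: int) -> Iterator[List[Row]]:
    """Yield chunks of at most `chunksize` data rows, each with its own header."""
    if chunksize < 1:
        raise ValueError("chunksize must be at least 1")
    for start in range(0, len(rows), chunksize):
        yield [list(header)] + [list(r) for r in rows[start:start + chunksize]]


def append_chunks(chunks: Iterable[List[Row]]) -> List[Row]:
    """Vertically concatenate chunks, refusing chunks whose headers differ."""
    chunks = list(chunks)
    if not chunks:
        return []
    header = chunks[0][0]
    combined = [header]
    for chunk in chunks:
        if chunk[0] != header:
            raise ValueError("header mismatch between chunks")
        combined.extend(chunk[1:])
    return combined


def chunk_name(stem: str, index: int) -> str:
    """ASSUMPTION: chunk files are numbered with a zero-padded suffix."""
    return f"{stem}_{index:03d}.csv"


# Hypothetical input table; the column names are illustrative only.
header = ["key", "input"]
rows = [[f"k{i}", f"smiles{i}"] for i in range(5)]

chunks = list(split_rows(header, rows, chunksize=2))
names = [chunk_name("eos4e40_v1", i) for i in range(len(chunks))]
restored = append_chunks(chunks)
print(names)
print(restored == [header] + rows)
```

Repeating the header in every chunk keeps each chunk usable on its own, and checking it on reassembly catches batches that came from a different model.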
Everything the CLI does is also importable:
from eosframes import read_csv, hstack, fit, transform
df = read_csv("eos4e40_v1.csv")
params = fit(df)
scaled = transform(df, params)
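For readers unfamiliar with the fit and transform pattern, here is a generic robust-scaling sketch using only the standard library. It is not the eosframes scaler: it ignores column types, quantization, and imputation, which docs/scaling.md covers. The names `fit_robust` and `transform_robust` and the median and interquartile-range formula are assumptions chosen for illustration.

```python
import statistics
from typing import Dict, List, Tuple

Columns = Dict[str, List[float]]
Params = Dict[str, Tuple[float, float]]  # column name -> (center, scale)


def fit_robust(columns: Columns) -> Params:
    """Learn a median and interquartile range per column."""
    params: Params = {}
    for name, values in columns.items():
        center = statistics.median(values)
        q1, _, q3 = statistics.quantiles(values, n=4)
        iqr = q3 - q1
        # Fall back to a scale of 1.0 for constant columns to avoid dividing by zero.
        params[name] = (center, iqr if iqr > 0 else 1.0)
    return params


def transform_robust(columns: Columns, params: Params) -> Columns:
    """Apply previously fitted parameters: (value - center) / scale."""
    return {
        name: [(v - params[name][0]) / params[name][1] for v in values]
        for name, values in columns.items()
    }


# Hypothetical feature columns; one contains an outlier, one is constant.
data = {"x": [1.0, 2.0, 3.0, 4.0, 100.0], "c": [5.0, 5.0, 5.0]}
fitted = fit_robust(data)
scaled_data = transform_robust(data, fitted)
print(fitted)
print(scaled_data)
```

Keeping the fitted parameters separate from the transform step is what allows a scaler fitted on one file to be saved and applied to another.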
Run eosframes --help (or eosframes <command> --help) for inline help.
Commands
| Command | Purpose |
|---|---|
| split | Slice any CSV into chunk files for parallel model runs. |
| convert | CSV ↔ H5, or assemble a chunks folder. |
| append | Vertically concatenate batches from the same model. |
| dedupe | Drop duplicate rows by key. |
| stack | Horizontally combine outputs from different models. |
| unstack | Split a stacked file back into per-model files. |
| summary | Per-feature stats from a local file. |
| info | Model metadata fetched from GitHub. |
| columns | Feature definitions fetched from GitHub. |
| fit | Fit a type-aware robust scaler and save its parameters. |
| transform | Apply a saved scaler to a file. |
See docs/cli.md for every flag, example, and refusal condition.
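The difference between dropping duplicates by key and combining model outputs side by side can be sketched with plain dictionaries. This is a conceptual illustration only. The `key` column name, the keep-first rule, the inner-join behaviour, and the `label.column` naming are all assumptions, and the sketch does not claim to implement either of the stack modes described in docs/nomenclature.md.

```python
from typing import Dict, List

Table = List[Dict[str, str]]


def dedupe_by_key(rows: Table, key: str = "key") -> Table:
    """Keep the first row seen for each key value (ASSUMED policy)."""
    seen = set()
    unique: Table = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            unique.append(row)
    return unique


def stack_by_key(tables: Dict[str, Table], key: str = "key") -> Table:
    """Join tables on `key`, prefixing feature columns with each table's label."""
    if not tables:
        return []
    labels = list(tables)
    indexed = {
        label: {row[key]: row for row in dedupe_by_key(tables[label], key)}
        for label in labels
    }
    stacked: Table = []
    for k in indexed[labels[0]]:  # row order follows the first table
        if all(k in indexed[label] for label in labels):
            merged = {key: k}
            for label in labels:
                for col, val in indexed[label][k].items():
                    if col != key:
                        merged[f"{label}.{col}"] = val
            stacked.append(merged)
    return stacked


# Hypothetical outputs from two models; feature names are made up.
model_a = [
    {"key": "k1", "f0": "0.1"},
    {"key": "k2", "f0": "0.2"},
    {"key": "k1", "f0": "0.9"},  # duplicate key, dropped by dedupe
]
model_b = [{"key": "k2", "g0": "7"}, {"key": "k1", "g0": "8"}]

stacked = stack_by_key({"eos4e40": model_a, "eos7m30": model_b})
print(stacked)
```

Prefixing each feature with its model label keeps enough information in the column names to split the combined table back into per-model tables later.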
Documentation
- docs/cli.md: every CLI command, all flags, examples, and error patterns.
- docs/nomenclature.md: every recognised filename / directory pattern, the strict/lenient contract, and the two stack modes.
- docs/scaling.md: the type-aware robust scaler, covering column kinds, how each is picked, and quantization / imputation.
About the Ersilia Open Source Initiative
The Ersilia Open Source Initiative is a tech-nonprofit fueling sustainable research in the Global South. Ersilia's main asset is the Ersilia Model Hub, an open-source repository of AI/ML models for drug discovery.
File details
Details for the file eosframes-1.1.0.tar.gz.
File metadata
- Download URL: eosframes-1.1.0.tar.gz
- Upload date:
- Size: 51.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.4.1 CPython/3.12.3 Linux/6.17.0-1010-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 3c4d371bacdd3d7dc284c8d9c2b1f6d5cedab6b4caef0202db5f9b743621c97d |
| MD5 | 29afe8f255059502bb28e74b4ef0e466 |
| BLAKE2b-256 | ce110e376629be92861e2263a4c74a61e361a59292399a07d18662442122da99 |
File details
Details for the file eosframes-1.1.0-py3-none-any.whl.
File metadata
- Download URL: eosframes-1.1.0-py3-none-any.whl
- Upload date:
- Size: 56.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: poetry/2.4.1 CPython/3.12.3 Linux/6.17.0-1010-azure
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 70aa84bf032071d34d8eba19a96cafe4fefa6684e856771cc2cc616bd255fafe |
| MD5 | eeb03857437d2d9e45a7e1f72fd09035 |
| BLAKE2b-256 | 888899f82984c18d2af9abd2e37610e0db119c9a4b86cafc90d667e342d703ce |