EpistaBase Python SDK and CLI for notebook and platform data access
EpistaBase Python SDK
epistabase is the governed Python client for the EpistaBase platform. It gives
notebooks, scripts, and the epistabase CLI typed, authenticated access to the
same projects, experiments, catalog, queries, sequences, and imaging you use in
the browser — without ever handling raw storage credentials.
It is a thin client: it wraps the EpistaBase API and nothing more. Data engines, parsers, query planners, and statistics stay on the server or in your own environment.
Naming: you install the distribution epistabase, but the import package is
currently biolake (import biolake as bl) and the environment variables are
BIOLAKE_*. A full internal rename to epistabase is in progress.
Install
pip install epistabase
Optional extras pull in heavier dependencies only when you need them:
pip install "epistabase[image]" # numpy/pillow for tiled image reads
pip install "epistabase[cli]" # the `epistabase` command-line interface
The SDK requires Python 3.12.
Authenticate
The SDK authenticates with a scoped bearer token and an active workspace. There are two ways to get one.
Browser login (CLI)
epistabase auth login # opens your browser, stores the token locally
epistabase auth status # shows who you are and the active workspace
Credentials are written to ~/.epistabase/credentials.json. Access is governed
entirely by your EpistaBase account — installing the SDK grants nothing on its own.
Token / environment
For headless use (CI, servers, notebook kernels), provide a Personal Access Token and workspace through the environment:
export BIOLAKE_API_URL="https://api.biolake.example"
export BIOLAKE_TOKEN="your-personal-access-token" # or BIOLAKE_PAT
export BIOLAKE_WORKSPACE_ID="workspace-id"
export BIOLAKE_EXPERIMENT_ID="experiment-id" # optional context
export BIOLAKE_NOTEBOOK_ID="notebook-id" # optional context
The SDK never reads AWS, S3, or MinIO credentials. All data access flows through governed BioLake API/data-plane services.
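For headless jobs it can help to fail fast when the environment is incomplete. The sketch below uses only the standard library and the variable names from the block above. It assumes the three variables not marked optional are required, which this README implies but does not state outright; `missing_biolake_env` is a hypothetical helper, not part of the SDK.

```python
from collections.abc import Mapping

# Names taken from the environment block above. BIOLAKE_PAT is the documented
# alternative to BIOLAKE_TOKEN. Treating these as required is an assumption.
REQUIRED = ("BIOLAKE_API_URL", "BIOLAKE_WORKSPACE_ID")
TOKEN_NAMES = ("BIOLAKE_TOKEN", "BIOLAKE_PAT")


def missing_biolake_env(env: Mapping[str, str]) -> list[str]:
    """Return the names of settings that are absent or empty in `env`."""
    missing = [name for name in REQUIRED if not env.get(name)]
    # Either token variable is enough, so report them as one combined entry.
    if not any(env.get(name) for name in TOKEN_NAMES):
        missing.append(" or ".join(TOKEN_NAMES))
    return missing


# Demonstration with a plain dict instead of the real process environment.
example = {"BIOLAKE_API_URL": "https://api.biolake.example", "BIOLAKE_PAT": "pat"}
print(missing_biolake_env(example))  # ['BIOLAKE_WORKSPACE_ID']
```

In a real job you would pass `os.environ` and stop the run if the returned list is not empty.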
Command-line data access
The same operations are available from the epistabase CLI, sharing the SDK's
credentials and governance. Once you have authenticated (above), the verbs run
against your active workspace:
epistabase ls --experiment EXP-12 # discover catalog assets (--json to script)
epistabase get EXP-12/cells.fcs # show an asset's metadata + lineage
epistabase pull EXP-12/cells.fcs # download the data (defaults to ./cells.fcs)
epistabase query "SELECT * FROM counts" # run governed SQL (--out result.parquet to save)
Every command takes an opaque asset id or a readable [experiment/]name path
(ambiguous names report the candidates). get shows metadata; pull downloads
the bytes, or materializes a table to .csv / .parquet.
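Because ls accepts --json for scripting, the CLI can also be driven from Python. This is a minimal sketch under two assumptions: that --json prints a single JSON document on stdout, and that a non-zero exit code signals failure. The shape of the JSON is not described here, so the sketch returns it undecoded beyond `json.loads`. `build_ls_command` and `list_assets` are hypothetical helper names.

```python
import json
import subprocess


def build_ls_command(experiment: str) -> list[str]:
    """Build the argv for `epistabase ls --experiment <id> --json`."""
    return ["epistabase", "ls", "--experiment", experiment, "--json"]


def list_assets(experiment: str):
    """Run the CLI and decode its JSON output.

    Assumption: with --json the command prints one JSON document on stdout.
    The structure of that document is not specified in this README, so it is
    returned as parsed, without further interpretation.
    """
    completed = subprocess.run(
        build_ls_command(experiment),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode output as str
    )
    return json.loads(completed.stdout)
```

Passing the arguments as a list avoids shell quoting problems with experiment ids or names that contain spaces.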
Quickstart
import biolake as bl
# Query governed lakehouse tables
rows = bl.query("select * from current_table limit 10")
# Discover catalog assets
images = bl.assets(experiment="EXP-2026-0001", kind="IMAGE")
asset = bl.get(images[0].id)
# Promote a notebook output back into the experiment
result = bl.publish_figure(fig, name="Dose response", format="svg")
print(result.download_url)
The implicit session reads the environment above on first use. Pass an explicit
Session when you need more than one context in the same process.
Publishing figures and tables
Notebook outputs can be promoted back into the current experiment. The SDK renders the local object, sends bytes to the API, and the API stores the artifact in governed storage:
result = bl.publish_figure(fig, name="Dose response", format="svg")
result.catalog_asset_id
result.download_url
table = bl.publish_table(df, name="Summary table") # CSV by default
publish_figure() accepts raw bytes, matplotlib figures with savefig(), and
plotly figures with to_image(). publish_table() accepts raw bytes, CSV text,
or dataframe-like objects with to_csv() / to_parquet(). Published artifacts use
BIOLAKE_EXPERIMENT_ID unless experiment_id= is given, and record the source
notebook id plus the query log from prior bl.query() calls.
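Since publish_table() accepts CSV text, a table can be published without any dataframe library. The sketch below builds the CSV with the standard library. The column names and values are made up for illustration, `rows_to_csv_text` and `publish_summary` are hypothetical helpers, and passing the text as the first positional argument is an assumption based on the df example above.

```python
import csv
import io


def rows_to_csv_text(header: list[str], rows: list[list[object]]) -> str:
    """Serialize a header and rows into CSV text using the standard library."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def publish_summary(csv_text: str):
    """Hand the CSV text to the SDK. Needs the SDK installed and valid credentials."""
    import biolake as bl  # imported lazily so the helper above works without the SDK

    # name= appears in this README; passing CSV text positionally is an assumption.
    return bl.publish_table(csv_text, name="Summary table")


# Illustrative data only.
csv_text = rows_to_csv_text(["dose_um", "response"], [[0.1, 0.12], [1.0, 0.48]])
print(csv_text)
```

Calling `publish_summary(csv_text)` inside an authenticated session would then send the text to the API.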
Catalog discovery
bl.assets(...) lists catalog assets visible to your token. Filters are
server-side and governed by the same read authorization as the web app:
assets = bl.assets(experiment="EXP-2026-0001", kind="FLOW", tags=["sort-1"])
asset = bl.get(assets[0].id)
asset.kind # "FLOW", "IMAGE", "TABLE", "VOLUME", ...
asset.format # "fcs", "tiff", ...
asset.size_bytes
asset.experiment_number
asset.lineage
AssetRef and Asset are descriptors; they never expose standing storage
credentials. Type-specific accessors such as biolake.image layer on top.
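Descriptors are plain metadata, so they can be summarized locally before any data is read. The sketch below totals size_bytes per kind. It runs against a hypothetical stand-in class (`FakeAsset`) rather than real SDK objects, and assumes the objects you pass in expose kind and size_bytes as shown for the result of bl.get(); whether the list results from bl.assets() carry both fields is not stated here.

```python
from collections import defaultdict
from collections.abc import Iterable
from dataclasses import dataclass


@dataclass
class FakeAsset:
    """Hypothetical stand-in for an SDK asset descriptor, for illustration only."""
    kind: str
    size_bytes: int


def total_bytes_by_kind(assets: Iterable[FakeAsset]) -> dict[str, int]:
    """Sum size_bytes per asset kind; works on any object with both attributes."""
    totals: dict[str, int] = defaultdict(int)
    for asset in assets:
        totals[asset.kind] += asset.size_bytes
    return dict(totals)


sample = [FakeAsset("FLOW", 1_000), FakeAsset("IMAGE", 5_000), FakeAsset("FLOW", 250)]
print(total_bytes_by_kind(sample))  # {'FLOW': 1250, 'IMAGE': 5000}
```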
Image reads
pip install "epistabase[image]"
biolake.image resolves image assets through the catalog, mints short-lived WSI
tile sessions, and reads only bounded regions or thumbnails through the tile
service:
img = bl.assets(experiment="EXP-2026-0001", kind="IMAGE")[0]
info = bl.image.info(img)
region = bl.image.read_region(img, 0, 0, 2048, 2048, level=2, channels=["DAPI"])
overview = bl.image.thumbnail(img, max_px=1024)
rois = bl.image.annotations(img) # read-only GeoJSON ROIs
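Because reads are bounded, covering a large area means issuing several region reads. The sketch below only computes the windows and does not call the SDK. It assumes the four positional integers of read_region are x, y, width, and height in the coordinates of the chosen level, which this README does not confirm; `tile_windows` is a hypothetical helper.

```python
from collections.abc import Iterator


def tile_windows(width: int, height: int, tile: int = 2048) -> Iterator[tuple[int, int, int, int]]:
    """Yield (x, y, w, h) windows that cover a width x height area without overlap."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Edge windows are clipped so they never extend past the area.
            yield x, y, min(tile, width - x), min(tile, height - y)


windows = list(tile_windows(5000, 3000))
print(len(windows))  # 6
print(windows[-1])   # (4096, 2048, 904, 952)
```

Under the stated assumption, each window could be passed on as `bl.image.read_region(img, x, y, w, h, level=...)`.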
Whole-blob reads
bl.open(asset_id) streams whole raw blobs (FCS, vendor exports, gel TIFFs,
CSV/XLSX) via a short-lived presigned GET URL:
with bl.open("asset-id") as f:
header = f.read(64)
with bl.open("asset-id", byte_range=(0, 1023)) as f:
first_kb = f.read()
with bl.open("asset-id", download=True) as path: # for libraries needing a path
parse_vendor_file(path)
open() is whole-blob only; tiled/pyramidal images go through biolake.image.
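The byte_range example above names bytes 0 through 1023 as the first kilobyte, which suggests the range is inclusive at both ends, as in HTTP Range requests. Taking that as an assumption, the sketch below splits a blob of known size into consecutive inclusive ranges. `inclusive_ranges` is a hypothetical helper and does not call the SDK.

```python
from collections.abc import Iterator


def inclusive_ranges(size_bytes: int, chunk: int = 1024) -> Iterator[tuple[int, int]]:
    """Yield (first, last) byte offsets, both inclusive, covering size_bytes."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    for start in range(0, size_bytes, chunk):
        # The last range is clipped to the final byte of the blob.
        yield start, min(start + chunk, size_bytes) - 1


print(list(inclusive_ranges(2500)))  # [(0, 1023), (1024, 2047), (2048, 2499)]
```

Each pair could then be passed as `byte_range=` to bl.open(), with the total size taken from the asset's size_bytes.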
Notebook kernels
Inside an EpistaBase notebook kernel the environment is injected for you (a
short-lived scoped token plus a refresh URL), so import biolake as bl works with
no setup. The same code runs locally once you set the environment above or run
epistabase auth login.
Development
This package is developed inside the EpistaBase monorepo under sdk/ but is
released independently. From a checkout:
cd sdk
uv run --extra dev pytest -q
uv run --extra dev ruff check .
uv run --extra dev mypy src --strict
See AGENTS.md for the thin-client scope rules and docs/adr/ADR-012 for the
packaging and distribution decision.
License
Proprietary. © EpistaBase. Use of the SDK is governed by your EpistaBase agreement.
File details
Details for the file epistabase-0.1.0.tar.gz.
File metadata
- Download URL: epistabase-0.1.0.tar.gz
- Upload date:
- Size: 74.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | cb6a4f3c7bc0c45024c3c97f62781b4448300e222ceccdaa740d277775edff63 |
| MD5 | f7d34f0ad96417c01928c77ec95084d6 |
| BLAKE2b-256 | fcaaa3be5638dff2d0021547a46d00f9cb8d08558106b8468c46b572d20bfbbf |
Provenance
The following attestation bundles were made for epistabase-0.1.0.tar.gz:
Publisher: sdk-release.yml on McClain-Thiel/BioFlow
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: epistabase-0.1.0.tar.gz
- Subject digest: cb6a4f3c7bc0c45024c3c97f62781b4448300e222ceccdaa740d277775edff63
- Sigstore transparency entry: 1967652669
- Sigstore integration time:
- Permalink: McClain-Thiel/BioFlow@f083307730fcfab57e63684c5fc422ad7fffc3e4
- Branch / Tag: refs/tags/sdk-v0.1.0
- Owner: https://github.com/McClain-Thiel
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: sdk-release.yml@f083307730fcfab57e63684c5fc422ad7fffc3e4
- Trigger Event: push
File details
Details for the file epistabase-0.1.0-py3-none-any.whl.
File metadata
- Download URL: epistabase-0.1.0-py3-none-any.whl
- Upload date:
- Size: 48.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? Yes
- Uploaded via: twine/6.1.0 CPython/3.13.12
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6bf7f868d7fc450f2189c85a32a9650fb741d68f1475334851ad64acea3a2e6d |
| MD5 | a46396700ea0c911f1f67223055382d5 |
| BLAKE2b-256 | bedc4da5eb28ab1d06601096bb23a6d9cac56a86157e6ed3da7f77cb6da35bb7 |
Provenance
The following attestation bundles were made for epistabase-0.1.0-py3-none-any.whl:
Publisher: sdk-release.yml on McClain-Thiel/BioFlow
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: epistabase-0.1.0-py3-none-any.whl
- Subject digest: 6bf7f868d7fc450f2189c85a32a9650fb741d68f1475334851ad64acea3a2e6d
- Sigstore transparency entry: 1967652748
- Sigstore integration time:
- Permalink: McClain-Thiel/BioFlow@f083307730fcfab57e63684c5fc422ad7fffc3e4
- Branch / Tag: refs/tags/sdk-v0.1.0
- Owner: https://github.com/McClain-Thiel
- Access: private
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: sdk-release.yml@f083307730fcfab57e63684c5fc422ad7fffc3e4
- Trigger Event: push