EpistaBase Python SDK and CLI for notebook and platform data access

EpistaBase Python SDK

epistabase is the governed Python client for the EpistaBase platform. It gives notebooks, scripts, and the epistabase CLI typed, authenticated access to the same projects, experiments, catalog, queries, sequences, and imaging you use in the browser, and it never requires you to handle raw storage credentials.

It is a thin client: it wraps the EpistaBase API and nothing more. Data engines, parsers, query planners, and statistics stay on the server or in your own environment.

Naming: you install the distribution epistabase, but the import package is currently biolake (import biolake as bl) and the environment variables are BIOLAKE_*. A full internal rename to epistabase is in progress.

Install

pip install epistabase

Optional extras pull in heavier dependencies only when you need them:

pip install "epistabase[image]"   # numpy/pillow for tiled image reads
pip install "epistabase[cli]"     # the `epistabase` command-line interface

The SDK requires Python 3.12.

Authenticate

The SDK authenticates with a scoped bearer token and an active workspace. There are two ways to get one.

Sign in (CLI)

# Mint a Personal Access Token in the app (Settings → Developer), then:
epistabase auth login --token <PAT> --api-url https://api.example --workspace <id>
epistabase auth status             # shows the resolved API URL, workspace, and token source

Credentials are written to ~/.epistabase/credentials.json and are picked up by the SDK automatically. Access is governed entirely by your EpistaBase account — installing the SDK grants nothing on its own. (Browser/device login is on the roadmap; for now, sign in with a token.)

Token / environment

For headless use (CI, servers, notebook kernels), provide a Personal Access Token and workspace through the environment:

export BIOLAKE_API_URL="https://api.biolake.example"
export BIOLAKE_TOKEN="your-personal-access-token"   # or BIOLAKE_PAT
export BIOLAKE_WORKSPACE_ID="workspace-id"
export BIOLAKE_EXPERIMENT_ID="experiment-id"        # optional context
export BIOLAKE_NOTEBOOK_ID="notebook-id"            # optional context

The SDK never reads AWS, S3, or MinIO credentials. All data access flows through the governed EpistaBase API and data-plane services.

Command-line data access

The same operations are available from the epistabase CLI, sharing the SDK's credentials and governance. Once you have authenticated (above), the verbs run against your active workspace:

epistabase ls --experiment EXP-12        # discover catalog assets   (--json to script)
epistabase get EXP-12/cells.fcs          # show an asset's metadata + lineage
epistabase pull EXP-12/cells.fcs         # download the data         (defaults to ./cells.fcs)
epistabase query "SELECT * FROM counts"  # run governed SQL          (--out result.parquet to save)

The asset commands (get and pull) take either an opaque asset id or a readable [experiment/]name path; ambiguous names report the candidates. get shows metadata only, while pull downloads the bytes or materializes a table to .csv / .parquet.
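
When a script needs the CLI rather than the SDK, the ls verb with --json can be driven from Python. This is a minimal sketch: the function names are hypothetical, and it assumes that --json prints a single JSON document on stdout. The shape of that document is not specified here, so the result is returned unparsed beyond json.loads.

```python
import json
import shutil
import subprocess


def ls_command(experiment: str) -> list[str]:
    """Build the argv for `epistabase ls --experiment <id> --json`."""
    return ["epistabase", "ls", "--experiment", experiment, "--json"]


def list_assets(experiment: str):
    """Run the CLI and parse its JSON output.

    Assumption: with --json the command prints one JSON document on stdout.
    """
    if shutil.which("epistabase") is None:
        raise RuntimeError("the epistabase CLI is not on PATH")
    done = subprocess.run(
        ls_command(experiment), capture_output=True, text=True, check=True
    )
    return json.loads(done.stdout)
```

Passing the arguments as a list avoids shell quoting problems with experiment ids, and check=True turns a non-zero exit into an exception instead of an empty result.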

Quickstart

import biolake as bl

# Query governed lakehouse tables
rows = bl.query("select * from current_table limit 10")

# Discover catalog assets
images = bl.assets(experiment="EXP-2026-0001", kind="IMAGE")
asset = bl.get(images[0].id)

# Promote a notebook output back into the experiment
# (fig is a matplotlib or plotly figure created earlier in the notebook)
result = bl.publish_figure(fig, name="Dose response", format="svg")
print(result.download_url)

The implicit session reads the environment above on first use. Pass an explicit Session when you need more than one context in the same process.

Publishing figures and tables

Notebook outputs can be promoted back into the current experiment. The SDK renders the local object, sends bytes to the API, and the API stores the artifact in governed storage:

result = bl.publish_figure(fig, name="Dose response", format="svg")
result.catalog_asset_id
result.download_url

table = bl.publish_table(df, name="Summary table")  # CSV by default

publish_figure() accepts raw bytes, matplotlib figures with savefig(), and plotly figures with to_image(). publish_table() accepts raw bytes, CSV text, or dataframe-like objects with to_csv() / to_parquet(). Published artifacts use BIOLAKE_EXPERIMENT_ID unless experiment_id= is given, and record the source notebook id plus the query log from prior bl.query() calls.
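
Because publish_table() accepts CSV text, a table can be published without a dataframe library. The sketch below builds that text with the standard library only; rows_to_csv and the sample columns are hypothetical and exist purely for illustration.

```python
import csv
import io


def rows_to_csv(header, rows) -> str:
    """Serialize a header and rows to CSV text with \n line endings."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# Illustrative values only, not real measurements.
csv_text = rows_to_csv(["dose_um", "response"], [[0.1, 0.12], [1.0, 0.57]])
print(csv_text)
```

In an authenticated session the resulting string could then be passed as bl.publish_table(csv_text, name="Summary table").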

Catalog discovery

bl.assets(...) lists catalog assets visible to your token. Filters are server-side and governed by the same read authorization as the web app:

assets = bl.assets(experiment="EXP-2026-0001", kind="FLOW", tags=["sort-1"])
asset = bl.get(assets[0].id)

asset.kind              # "FLOW", "IMAGE", "TABLE", "VOLUME", ...
asset.format            # "fcs", "tiff", ...
asset.size_bytes
asset.experiment_number
asset.lineage

AssetRef and Asset are descriptors; they never expose standing storage credentials. Type-specific accessors such as biolake.image layer on top.
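
As a small illustration of working with these descriptors, the sketch below totals count and size per kind. It relies only on the kind and size_bytes attributes shown above on the result of bl.get(); summarize_by_kind is a hypothetical helper, and the stand-in objects replace real assets so the example runs without the SDK or a token.

```python
from collections import defaultdict
from types import SimpleNamespace


def summarize_by_kind(assets):
    """Return {kind: (count, total_bytes)} for asset-like objects."""
    totals = defaultdict(lambda: [0, 0])
    for asset in assets:
        entry = totals[asset.kind]
        entry[0] += 1
        # Treat a missing size as zero (assumption: size_bytes may be None).
        entry[1] += asset.size_bytes or 0
    return {kind: (count, size) for kind, (count, size) in totals.items()}


# Stand-in objects with made-up sizes.
demo = [
    SimpleNamespace(kind="FLOW", size_bytes=1_000),
    SimpleNamespace(kind="FLOW", size_bytes=2_500),
    SimpleNamespace(kind="IMAGE", size_bytes=40_000),
]
print(summarize_by_kind(demo))
```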

Image reads

pip install "epistabase[image]"

biolake.image resolves image assets through the catalog, mints short-lived WSI tile sessions, and reads only bounded regions or thumbnails through the tile service:

img = bl.assets(experiment="EXP-2026-0001", kind="IMAGE")[0]

info = bl.image.info(img)
region = bl.image.read_region(img, 0, 0, 2048, 2048, level=2, channels=["DAPI"])
overview = bl.image.thumbnail(img, max_px=1024)
rois = bl.image.annotations(img)            # read-only GeoJSON ROIs
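
Since reads are bounded, covering a large plane means issuing several region reads. The sketch below computes the boxes for such a sweep. It assumes the four positional arguments of read_region are x, y, width, and height, which the example above suggests but does not state, and that the plane dimensions come from elsewhere (for instance bl.image.info, whose fields are not documented here). tile_grid is a hypothetical helper.

```python
from collections.abc import Iterator


def tile_grid(
    width: int, height: int, tile: int = 2048
) -> Iterator[tuple[int, int, int, int]]:
    """Yield (x, y, w, h) boxes that cover a width x height plane row by row."""
    if tile <= 0:
        raise ValueError("tile must be positive")
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            # Edge tiles are clipped so no box extends past the plane.
            yield x, y, min(tile, width - x), min(tile, height - y)


boxes = list(tile_grid(5000, 3000))
print(len(boxes), boxes[-1])
```

Under the stated assumption, each box could be passed on as bl.image.read_region(img, x, y, w, h, level=...).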

Whole-blob reads

bl.open(asset_id) streams whole raw blobs (FCS, vendor exports, gel TIFFs, CSV/XLSX) via a short-lived presigned GET URL:

with bl.open("asset-id") as f:
    header = f.read(64)

with bl.open("asset-id", byte_range=(0, 1023)) as f:
    first_kb = f.read()

with bl.open("asset-id", download=True) as path:   # for libraries needing a path
    parse_vendor_file(path)                        # placeholder for your own parser

open() is whole-blob only; tiled/pyramidal images go through biolake.image.
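
To read a large blob in pieces, the byte_range argument can be driven by a simple range generator. The sketch below assumes byte_range is an inclusive (start, end) pair, inferred from the (0, 1023) example above that yields the first kilobyte. byte_ranges is a hypothetical helper, and the total size would come from asset.size_bytes.

```python
def byte_ranges(size: int, chunk: int = 1024):
    """Yield inclusive (start, end) pairs that together cover `size` bytes."""
    if chunk <= 0:
        raise ValueError("chunk must be positive")
    for start in range(0, size, chunk):
        # The last range is shortened so it ends on the final byte.
        yield start, min(start + chunk, size) - 1


print(list(byte_ranges(2500)))
```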

Notebook kernels

Inside an EpistaBase notebook kernel the environment is injected for you (a short-lived scoped token plus a refresh URL), so import biolake as bl works with no setup. The same code runs locally once you set the environment above or run epistabase auth login.

Development

This package is developed inside the EpistaBase monorepo under sdk/ but is released independently. From a checkout:

cd sdk
uv run --extra dev pytest -q
uv run --extra dev ruff check .
uv run --extra dev mypy src --strict

See AGENTS.md for the thin-client scope rules and docs/adr/ADR-012 for the packaging and distribution decision.

License

Proprietary. © EpistaBase. Use of the SDK is governed by your EpistaBase agreement.
