
Open data and models for astrocyte dynamics


💫 OpenAstrocytes: Open data and models for astrocyte dynamics

A Python library for discovering, loading, and processing experimental imaging datasets from astrocyte neuroscience research using cloud-hosted data infrastructure.

❤️‍🔥 Forecast

Python 3.12+

Features

  • Unified Data Discovery: Access experimental datasets through a single Hive interface backed by cloud-hosted manifests
  • Type-Safe Schemas: Strongly-typed dataclasses for different experiment types (bath application, photochemical uncaging)
  • Lens Transformations: Composable data pipelines for converting raw frames to typed experiments
  • atdata + WebDataset Format: Streaming-friendly, schematized TAR archives for efficient cloud storage and access

To see OpenAstrocytes in action, check out the demo in our release pub.

Installation

# Install the core package
pip install astrocytes

# Or with uv (recommended for development)
uv pip install astrocytes

Requirements: Python 3.12 or 3.13

Quick Start

import astrocytes

# Access the data repository
hive = astrocytes.Hive()

# Load a dataset via shortcuts
dataset = astrocytes.data.bath_application

# Iterate through frames
for frame in dataset.ordered(batch_size=None):
    print(f"Frame at t={frame.t:.1f}s, compound={frame.applied_compound}")
    # frame.image is a numpy array of raw 2P imaging data

Architecture

Three-Tier Data Organization

The library organizes imaging data in three tiers:

┌────────────────────────────────────────────────┐
│  Tier 1: Generic (toile.Frame)                 │
│  Raw imaging data with minimal structure       │
└─────────────────┬──────────────────────────────┘
                  │ Lens Transformation
┌─────────────────▼──────────────────────────────┐
│  Tier 2: Typed Experiments                     │
│  BathApplicationFrame, UncagingFrame, etc.     │
│  Domain-specific metadata extracted            │
└─────────────────┬──────────────────────────────┘
                  │
┌─────────────────▼──────────────────────────────┐
│  Tier 3: Derived Results (Pre-computed)        │
│  EmbeddingResult, EmbeddingPCResult            │
│  Vision transformer outputs, PCA projections   │
└────────────────────────────────────────────────┘
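To make the step from tier 1 to tier 2 concrete, here is a minimal, self-contained sketch of a lens as a plain function from one frame type to another, applied lazily. It does not use the atdata or toile APIs: `GenericFrame`, `TypedFrame`, `to_typed`, and `apply_lens` are hypothetical stand-ins, and the metadata keys are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, Iterator, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass
class GenericFrame:
    """Hypothetical tier 1 stand-in: pixels plus loosely structured metadata."""
    image: list
    metadata: dict = field(default_factory=dict)


@dataclass
class TypedFrame:
    """Hypothetical tier 2 stand-in: metadata promoted to named fields."""
    image: list
    t: float
    applied_compound: str


def to_typed(frame: GenericFrame) -> TypedFrame:
    """A lens: extract the fields one experiment type needs from the metadata."""
    return TypedFrame(
        image=frame.image,
        t=float(frame.metadata["t"]),
        applied_compound=str(frame.metadata.get("compound", "unknown")),
    )


def apply_lens(frames: Iterable[A], lens: Callable[[A], B]) -> Iterator[B]:
    """Apply a lens lazily so frames can be streamed one at a time."""
    for frame in frames:
        yield lens(frame)


raw = [
    GenericFrame(image=[[0.0, 1.0]], metadata={"t": 0.5, "compound": "baclofen"}),
    GenericFrame(image=[[2.0, 3.0]], metadata={"t": 1.0}),  # compound missing
]
typed = list(apply_lens(raw, to_typed))
```

Keeping the lens a pure function of a single frame is what lets the same transformation run over a streamed dataset without loading everything into memory.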

The Hive Pattern

The Hive class serves as the main entry point, fetching a YAML manifest from the cloud and organizing datasets hierarchically:

hive = astrocytes.Hive()  # Fetches default manifest from data.forecastbio.cloud

# Navigate the hierarchy
generic_frames = hive.index.generic.bath_application.dataset
embeddings = hive.index.embeddings.bath_application.dataset  # Pre-computed embeddings
pca_reduced = hive.index.patch_pcs.bath_application.dataset  # Pre-computed PCA projections
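The attribute-style navigation above can be pictured as a nested mapping wrapped for dotted access. The sketch below is an illustration only: the manifest layout, the `example.org` URLs, and the `to_namespace` helper are hypothetical, and it starts from an already-parsed dict rather than fetching a YAML manifest as the real Hive does.

```python
from types import SimpleNamespace

# Hypothetical manifest contents, already parsed into plain dicts.
manifest = {
    "generic": {
        "bath_application": {"url": "https://example.org/generic/bath.tar"},
        "uncaging": {"url": "https://example.org/generic/uncaging.tar"},
    },
    "embeddings": {
        "bath_application": {"url": "https://example.org/embeddings/bath.tar"},
    },
}


def to_namespace(node):
    """Recursively wrap nested dicts so entries are reachable as attributes."""
    if isinstance(node, dict):
        return SimpleNamespace(**{key: to_namespace(value) for key, value in node.items()})
    return node


index = to_namespace(manifest)
print(index.generic.bath_application.url)
print(sorted(vars(index.generic)))  # dataset names listed under "generic"
```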

Usage Examples

Working with Typed Experiments

Convert generic frames to experiment-specific types using lens transformations:

import astrocytes
from astrocytes.schema import BathApplicationFrame

# Load generic frames
generic_dataset = astrocytes.data.bath_application

# Apply lens transformation to get typed frames
typed_dataset = generic_dataset.as_type(BathApplicationFrame)

# Now iterate with full type information
for frame in typed_dataset.ordered(batch_size=None):
    print(f"Compound: {frame.applied_compound}")
    print(f"Time: {frame.t:.2f}s (intervention at {frame.t_intervention}s)")
    print(f"Mouse: {frame.mouse_id}, Slice: {frame.slice_id}")
    print(f"Image shape: {frame.image.shape}")
    print(f"Pixel scale: {frame.scale_x}μm × {frame.scale_y}μm")

Working with Pre-computed Embeddings

The data repository includes pre-computed vision transformer embeddings and PCA projections. You can access these directly or apply custom transformations:

from astrocytes import data

# Access pre-computed embeddings
embeddings = data.bath_application_embeddings
for result in embeddings.ordered(batch_size=None):
    print(f"CLS embedding shape: {result.cls_embedding.shape}")
    print(f"Patch embeddings shape: {result.patches.shape}")  # (h, w, embedding_dim)
    break

# Access pre-computed PCA projections
pca_results = data.bath_application_patch_pcs
for result in pca_results.ordered(batch_size=None):
    print(f"Patch PCs shape: {result.patch_pcs.shape}")  # (h, w, n_components)
    break

Experiment Types

Bath Application

Experiments where compounds are applied to the bath solution:

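import astrocytes
from astrocytes.schema import BathApplicationFrame, BathApplicationCompound

# Convert the generic frames to typed bath application frames
typed_dataset = astrocytes.data.bath_application.as_type(BathApplicationFrame)

# Compounds: 'baclofen', 'tacpd', 'unknown'
for frame in typed_dataset.ordered(batch_size=None):
    if frame.applied_compound == 'baclofen':
        # Analyze GABA_B receptor activation
        pass
    # ...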
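A common first step with these fields is to split the frames into pre- and post-intervention groups per compound. The sketch below uses a hypothetical `Frame` stand-in that carries only the `t`, `t_intervention`, and `applied_compound` fields shown earlier, so it runs without downloading data. With the real dataset you would iterate `typed_dataset.ordered(batch_size=None)` in place of the `frames` list. Treating frames at exactly `t_intervention` as "post" is an assumption of this sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Frame:
    """Hypothetical stand-in with a subset of BathApplicationFrame's fields."""
    t: float
    t_intervention: float
    applied_compound: str


def split_by_intervention(frames):
    """Count frames before and after the intervention time, per compound."""
    counts = defaultdict(lambda: {"pre": 0, "post": 0})
    for frame in frames:
        phase = "pre" if frame.t < frame.t_intervention else "post"
        counts[frame.applied_compound][phase] += 1
    return dict(counts)


frames = [
    Frame(t=10.0, t_intervention=60.0, applied_compound="baclofen"),
    Frame(t=75.0, t_intervention=60.0, applied_compound="baclofen"),
    Frame(t=30.0, t_intervention=60.0, applied_compound="tacpd"),
]
print(split_by_intervention(frames))
```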

Photochemical Uncaging

Experiments using two-photon photo-uncaging to release caged neurotransmitters:

import astrocytes
from astrocytes.schema import UncagingFrame

dataset = astrocytes.data.uncaging
typed = dataset.map(UncagingFrame.from_generic)

# Compounds: 'gaba', 'glu', 'laser_only', 'unknown'
for frame in typed.ordered(batch_size=None):
    if frame.uncaged_compound == 'glu':
        # Analyze glutamate uncaging response
        pass
    # ...

Dataset Shortcuts

For convenience, common dataset combinations are available directly:

import astrocytes

# Generic datasets (toile.Frame)
astrocytes.data.bath_application
astrocytes.data.uncaging

# Derived datasets (processed)
astrocytes.data.bath_application_embeddings   # EmbeddingResult
astrocytes.data.bath_application_patch_pcs    # EmbeddingPCResult

Development Setup

# Clone the repository
git clone https://github.com/forecast-bio/open-astrocytes.git
cd open-astrocytes

# Install with development dependencies using uv
uv sync --locked --all-extras --dev

# Run tests
uv run pytest

# Run tests with coverage
uv run pytest --cov=astrocytes --cov-report=html

Project Structure

open-astrocytes/
├── src/astrocytes/
│   ├── __init__.py              # Main package entry point
│   ├── schema.py                # Public schema API
│   └── _datasets/               # Dataset management
│       ├── __init__.py          # Hive and DatasetIndex
│       ├── _common.py           # Base classes
│       ├── _bath_application.py # Bath application schema
│       ├── _uncaging.py         # Uncaging schema
│       ├── _embeddings.py       # Embedding schemas
│       └── _future.py           # Future expansions
├── tests/                       # Test suite
├── pyproject.toml               # Project metadata
└── README.md                    # This file

Key Dependencies

  • atdata: Core dataset abstraction and lens transformations
  • toile: Generic imaging frame schema
  • matplotlib: Plotting and visualization
  • scikit-image: Image processing utilities
  • scipy: Scientific computing tools

Data Repository

The default data repository is hosted at:

https://data.forecastbio.cloud/open-astrocytes/

The manifest is automatically fetched when you create a Hive() instance. You can specify a custom repository location to use a separate, cloned instance:

hive = astrocytes.Hive(root='https://my-custom-repo.com/astrocytes')

Contributing

Contributions are welcome! To add a new experiment type:

  1. Create a new schema module in src/astrocytes/_datasets/_your_experiment.py
  2. Define a typed frame class inheriting from ExperimentFrame
  3. Implement the from_generic() lens transformation
  4. Add the dataset to DatasetIndex in _datasets/__init__.py
  5. Export types in schema.py
  6. Add tests in tests/test_datasets.py

See CLAUDE.md for detailed development guidelines.
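As a rough picture of steps 2, 3, and 6, the sketch below defines a typed frame, a `from_generic()` constructor, and a pytest-style test. Everything here is a hypothetical stand-in: the `ExperimentFrame` shown is not the library's base class, `ElectricalStimFrame` and `StimPattern` are invented for illustration, and the real `from_generic()` presumably takes a generic frame rather than a bare metadata dict.

```python
from dataclasses import dataclass
from typing import Literal, get_args

# Hypothetical label set for an invented experiment type.
StimPattern = Literal["single_pulse", "train", "unknown"]


@dataclass
class ExperimentFrame:
    """Stand-in for the library's base class; the real one may differ."""
    t: float
    mouse_id: str


@dataclass
class ElectricalStimFrame(ExperimentFrame):
    """Hypothetical new experiment type (step 2)."""
    stim_pattern: StimPattern

    @classmethod
    def from_generic(cls, metadata: dict) -> "ElectricalStimFrame":
        """Step 3: build a typed frame from generic metadata."""
        pattern = metadata.get("stim_pattern", "unknown")
        if pattern not in get_args(StimPattern):
            pattern = "unknown"  # fall back rather than fail on unexpected labels
        return cls(
            t=float(metadata["t"]),
            mouse_id=str(metadata["mouse_id"]),
            stim_pattern=pattern,
        )


def test_from_generic_falls_back_to_unknown():
    """Step 6: a pytest-style test for the lens."""
    frame = ElectricalStimFrame.from_generic(
        {"t": 1.5, "mouse_id": "m01", "stim_pattern": "???"}
    )
    assert frame.stim_pattern == "unknown"
    assert frame.t == 1.5


test_from_generic_falls_back_to_unknown()
```

Falling back to `'unknown'` mirrors the `'unknown'` compound label that the existing experiment types list. Whether a new type should fall back or raise on unexpected metadata is a design choice for the contributor.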

Citation

If you use this library in your research, please cite:

@article{levesque2025openastrocytes,
  author = {Maxine Levesque and Kira Poskanzer},
  title = {OpenAstrocytes},
  journal = {Forecast Research},
  year = {2025},
  note = {https://forecast.bio/research/open-astrocytes/},
}

License

This project is licensed under the Mozilla Public License 2.0 - see the LICENSE.md file for details.

Acknowledgments

Developed by the Open Science team at Forecast.

Docs and README largely by Claude. If they hallucinated, let us know in the Issues!

Support for the production of OpenAstrocytes at Forecast was generously provided by the Special Initiatives division of the Astera Institute.
