An attempt to speed up access to large NWB (Neurodata Without Borders) files stored in the cloud.

lazynwb

Efficient read-only access to tables, time series and metadata across multiple local or cloud-hosted NWB files simultaneously, without loading entire files into memory.

pip install lazynwb

Why lazynwb

With pynwb, each container type has its own access pattern - dot attributes, dictionary keys, and method calls get mixed together depending on where the data lives in the file:

# pynwb
from pynwb import NWBHDF5IO
io = NWBHDF5IO('my_file.nwb', 'r')
nwb = io.read()

nwb.units.to_dataframe()
nwb.trials.to_dataframe()
nwb.processing['behavior']['eye_tracking'].to_dataframe()
nwb.processing['ophys']['Fluorescence']['RoiResponseSeries'].data[:]

You need to know whether something is a property, a dict-like container, or a DynamicTable, and chain the right combination for each.

With lazynwb, every table is accessed the same way - by its internal path:

# lazynwb
import lazynwb

lazynwb.get_df('my_file.nwb', '/units')
lazynwb.get_df('my_file.nwb', '/intervals/trials')
lazynwb.get_df('my_file.nwb', '/processing/behavior/eye_tracking')

And for time series data:

ts = lazynwb.get_timeseries('my_file.nwb', '/processing/ophys/Fluorescence/RoiResponseSeries')
ts.data[:]

No need to know the container class or chain attribute lookups. The same path works for any file, any backend (HDF5 or Zarr), local or remote, and extends to reading across multiple files in one call.
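
The idea of addressing everything by one internal path can be illustrated without lazynwb at all. Below is a minimal sketch, assuming a nested dict stands in for a file's group hierarchy. `resolve_path` and `fake_file` are hypothetical names made up for this illustration; they are not part of lazynwb, and this is not how lazynwb is implemented.

```python
from __future__ import annotations

from typing import Any


def resolve_path(root: dict[str, Any], internal_path: str) -> Any:
    """Walk a nested mapping using a '/'-separated path such as '/intervals/trials'."""
    node: Any = root
    # skip empty segments so '/units' and 'units' resolve to the same node
    for key in (part for part in internal_path.split('/') if part):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"{internal_path!r} not found (failed at {key!r})")
        node = node[key]
    return node


# hypothetical stand-in for a file's hierarchy (not real NWB data)
fake_file = {
    'units': {'columns': ['unit_id', 'location']},
    'intervals': {'trials': {'columns': ['start_time', 'stop_time']}},
    'processing': {'behavior': {'eye_tracking': {'columns': ['x', 'y']}}},
}

trials = resolve_path(fake_file, '/intervals/trials')
eye_tracking = resolve_path(fake_file, '/processing/behavior/eye_tracking')
```

The caller only ever supplies a string, so the same call shape works at any depth of nesting.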

lazynwb can also read only the columns and rows you request, rather than loading the entire table into memory first. This matters for tables with list- or array-like columns, such as the units table, where spike_times and waveform_mean can be very large compared to the single-value metric columns.

Quick start

import lazynwb

# read the trials table as a pandas DataFrame
df = lazynwb.get_df('my_file.nwb', '/intervals/trials')

Use get_internal_paths to find available paths if you're not sure what's in a file:

lazynwb.get_internal_paths('my_file.nwb')

Reading tables

As a pandas or polars DataFrame (`get_df`)

Returns a pandas DataFrame by default:

df = lazynwb.get_df('my_file.nwb', '/units')

Return a polars DataFrame instead:

df = lazynwb.get_df('my_file.nwb', '/units', as_polars=True)

Select specific columns:

df = lazynwb.get_df('my_file.nwb', '/units', include_column_names=['unit_id', 'location'])

Exclude specific columns:

df = lazynwb.get_df('my_file.nwb', '/units', exclude_column_names=['waveform_mean'])

Large array columns like spike_times and waveform_mean are excluded by default (exclude_array_columns=True). To include them, pass exclude_array_columns=False:

df = lazynwb.get_df('my_file.nwb', '/units', exclude_array_columns=False)

Read a table across multiple files into a single DataFrame:

df = lazynwb.get_df(
    ['file_1.nwb', 'file_2.nwb', 'file_3.nwb'],
    '/intervals/trials',
)

Each row gets _nwb_path, _table_path and _table_index columns to identify its source file and original row index.
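
To make the role of those three columns concrete, here is a rough sketch of tagging rows while combining tables from several files. It uses plain lists of dicts instead of DataFrames, and `concat_tables` is a hypothetical helper written for this example, not a lazynwb function.

```python
from __future__ import annotations

# column names as documented above
NWB_PATH = '_nwb_path'
TABLE_PATH = '_table_path'
TABLE_INDEX = '_table_index'


def concat_tables(tables_by_file: dict[str, list[dict]], table_path: str) -> list[dict]:
    """Combine per-file rows into one list, tagging each row with its origin."""
    combined = []
    for nwb_path, rows in tables_by_file.items():
        for index, row in enumerate(rows):
            # the index restarts at 0 for every file, so it refers to the original table
            combined.append({**row, NWB_PATH: nwb_path, TABLE_PATH: table_path, TABLE_INDEX: index})
    return combined


# made-up rows standing in for the trials tables of two files
tables = {
    'file_1.nwb': [{'start_time': 0.0}, {'start_time': 1.5}],
    'file_2.nwb': [{'start_time': 0.2}],
}
rows = concat_tables(tables, '/intervals/trials')
```

After combining, the pair (`_nwb_path`, `_table_index`) is enough to trace any row back to its source.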

As a Polars LazyFrame (`scan_nwb`)

scan_nwb returns a polars.LazyFrame that reads data on demand. Only the columns and rows you actually use are fetched from disk or the network, which makes it useful for large files or files on cloud storage.

import lazynwb
import polars as pl

lf = lazynwb.scan_nwb('my_file.nwb', '/units')

# filter rows and select columns - only the needed data is read
df = (
    lf
    .filter(pl.col('presence_ratio') >= 0.9)
    .select('unit_id', 'location', 'spike_times')
    .collect()
)
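
Why does filtering before collecting save I/O? The sketch below is a toy model of the idea, assuming a column-oriented store where each column can be read independently. `ColumnStore` and `filter_then_select` are hypothetical names for this illustration; the real work in scan_nwb is done by polars and lazynwb, not by code like this.

```python
from __future__ import annotations


class ColumnStore:
    """Toy column-oriented table that records which columns were read."""

    def __init__(self, columns: dict[str, list]) -> None:
        self._columns = columns
        self.columns_read: list[str] = []

    def read(self, name: str, row_indices: list[int] | None = None) -> list:
        self.columns_read.append(name)
        values = self._columns[name]
        return values if row_indices is None else [values[i] for i in row_indices]


def filter_then_select(store, filter_column, predicate, select_columns):
    # step 1: read only the column the filter needs
    mask_values = store.read(filter_column)
    keep = [i for i, value in enumerate(mask_values) if predicate(value)]
    # step 2: read only the requested columns, and only the surviving rows
    return {name: store.read(name, keep) for name in select_columns}


store = ColumnStore({
    'unit_id': [0, 1, 2],
    'presence_ratio': [0.95, 0.5, 0.99],
    'waveform_mean': [[0.1] * 4, [0.2] * 4, [0.3] * 4],  # never touched below
})
result = filter_then_select(store, 'presence_ratio', lambda v: v >= 0.9, ['unit_id'])
```

Here `store.columns_read` ends up as `['presence_ratio', 'unit_id']`: the large `waveform_mean` column is never read.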

Read across multiple files:

lf = lazynwb.scan_nwb(
    ['file_1.nwb', 'file_2.nwb'],
    '/units',
)
df = (
    lf
    .filter(
        pl.col('amplitude_cutoff') <= 0.1,
        pl.col('isi_violations_ratio') <= 0.5,
    )
    .select('unit_id', 'location', 'spike_times', '_nwb_path')
    .collect()
)

Control schema inference when files have slightly different column types:

lf = lazynwb.scan_nwb(
    nwb_paths,
    '/units',
    infer_schema_length=5,               # only read first 5 files for schema
    schema_overrides={'unit_id': pl.Int64},  # force a column type
)
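
As a rough mental model of these two options, the sketch below merges per-file schemas from only the first few files and then applies overrides. The merge rule used here (first-seen type wins) is an assumption made for the example, and `infer_schema` is a hypothetical function; lazynwb's actual inference may differ.

```python
from __future__ import annotations


def infer_schema(file_schemas, infer_schema_length=None, schema_overrides=None):
    """Merge per-file {column: dtype} mappings, then apply explicit overrides."""
    unified: dict[str, str] = {}
    # slicing with None keeps every file; an integer keeps only the first N
    for schema in file_schemas[:infer_schema_length]:
        for column, dtype in schema.items():
            unified.setdefault(column, dtype)  # assumption: first-seen dtype wins
    unified.update(schema_overrides or {})
    return unified


# made-up schemas, with dtypes written as plain strings for simplicity
file_schemas = [
    {'unit_id': 'Int32', 'location': 'String'},
    {'unit_id': 'Int64', 'location': 'String', 'quality': 'String'},
    {'unit_id': 'Int64', 'extra': 'Float64'},
]
schema = infer_schema(file_schemas, infer_schema_length=2, schema_overrides={'unit_id': 'Int64'})
```

In this toy version, a column that only appears in files beyond infer_schema_length (`extra` here) is missing from the result, which is the trade-off of sampling fewer files.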

There's also read_nwb, which is the same as scan_nwb(...).collect():

df = lazynwb.read_nwb(nwb_paths, '/units')  # returns pl.DataFrame

Using `LazyNWB` (PyNWB-like interface)

Access tables and metadata from a single file with familiar attribute names:

nwb = lazynwb.LazyNWB('my_file.nwb')

# tables (returned as pandas DataFrames)
nwb.trials
nwb.units
nwb.epochs
nwb.electrodes

# metadata
nwb.session_id
nwb.session_start_time
nwb.session_description
nwb.identifier
nwb.experiment_description
nwb.experimenter
nwb.lab
nwb.institution
nwb.keywords

Subject metadata:

nwb.subject.age
nwb.subject.sex
nwb.subject.species
nwb.subject.genotype
nwb.subject.subject_id
nwb.subject.strain
nwb.subject.date_of_birth

Get a table as polars:

df = nwb.get_df('/units', as_polars=True)

Get a summary of everything in the file:

nwb.describe()
# {'identifier': '...', 'session_id': '...', ..., 'paths': ['/acquisition/...', '/units', ...]}

Time series

Get a single time series by searching for a name:

ts = lazynwb.get_timeseries('my_file.nwb', search_term='running_speed')

ts.data          # h5py.Dataset or zarr.Array (lazy - not loaded until sliced)
ts.timestamps    # h5py.Dataset or zarr.Array
ts.unit          # e.g. 'cm/s'
ts.rate          # sampling rate, if available
ts.description

Get a time series by exact internal path:

ts = lazynwb.get_timeseries('my_file.nwb', exact_path=True, search_term='/acquisition/lick_sensor_events')

Get all time series in the file:

all_ts = lazynwb.get_timeseries('my_file.nwb', match_all=True)
# dict: {'/acquisition/lick_sensor_events': TimeSeries(...), '/processing/behavior/running_speed': TimeSeries(...), ...}
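
The three lookup modes can be pictured as three ways of matching against the file's internal paths. The sketch below assumes that a plain search_term is matched as a substring of the path; that matching rule is a guess for illustration only, and `find_timeseries_paths` is a hypothetical helper, not part of lazynwb.

```python
def find_timeseries_paths(paths, search_term=None, exact_path=False, match_all=False):
    """Return the internal paths selected by each lookup mode."""
    if match_all:
        return list(paths)
    if search_term is None:
        raise ValueError('search_term is required unless match_all=True')
    if exact_path:
        return [p for p in paths if p == search_term]
    # assumption: a bare search term matches anywhere in the path
    return [p for p in paths if search_term in p]


paths = ['/acquisition/lick_sensor_events', '/processing/behavior/running_speed']

by_name = find_timeseries_paths(paths, search_term='running_speed')
by_path = find_timeseries_paths(paths, search_term='/acquisition/lick_sensor_events', exact_path=True)
everything = find_timeseries_paths(paths, match_all=True)
```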

Also available on a LazyNWB object:

nwb = lazynwb.LazyNWB('my_file.nwb')
ts = nwb.get_timeseries('running_speed')

Metadata across files

Get session and subject metadata for many files at once:

df = lazynwb.get_metadata_df(nwb_paths)  # pandas DataFrame

df = lazynwb.get_metadata_df(nwb_paths, as_polars=True)  # polars DataFrame

Returns columns including identifier, session_id, session_start_time, session_description, subject_id, age, sex, species, genotype, strain, date_of_birth, _nwb_path, and more.

File contents and schema

Discover internal paths

See what's inside an NWB file:

paths = lazynwb.get_internal_paths('my_file.nwb')
# {'/acquisition/lick_sensor_events/data': <HDF5 dataset ...>,
#  '/intervals/trials': <HDF5 group ...>,
#  '/units': <HDF5 group ...>,
#  ...}

Get table schema

Get the unified column names and types for a table across multiple files:

schema = lazynwb.get_table_schema(nwb_paths, '/intervals/trials')
# OrderedDict([('condition', String), ('id', Int64), ('start_time', Float64), ...])

The schema uses polars (Arrow) data types.

Format conversion

Export NWB tables to other file formats with convert_nwb_tables.

Supported formats: parquet, csv, json, excel, feather, arrow, avro, delta.

output_paths = lazynwb.convert_nwb_tables(
    nwb_paths,
    output_dir='./output',
    output_format='parquet',
)
# {'/intervals/trials': PosixPath('./output/trials.parquet'),
#  '/units': PosixPath('./output/units.parquet')}

Pass format-specific options via keyword arguments:

# parquet with zstd compression
lazynwb.convert_nwb_tables(nwb_paths, './output', output_format='parquet', compression='zstd')

# csv with custom separator
lazynwb.convert_nwb_tables(nwb_paths, './output', output_format='csv', separator='\t')

# json, pretty-printed
lazynwb.convert_nwb_tables(nwb_paths, './output', output_format='json', pretty=True)

Only export tables present in all files:

lazynwb.convert_nwb_tables(nwb_paths, './output', min_file_count=len(nwb_paths))

Use full internal paths as filenames (e.g. intervals_trials.parquet instead of trials.parquet):

lazynwb.convert_nwb_tables(nwb_paths, './output', full_path=True)
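
The effect of min_file_count and full_path can be sketched in a few lines. Both helpers below are hypothetical and written only for this illustration; in particular, using the format name as the file extension is an assumption that happens to hold for parquet but may not for other formats.

```python
from __future__ import annotations

from collections import Counter


def output_filename(table_path: str, output_format: str, full_path: bool = False) -> str:
    """Derive an output filename from an internal table path."""
    parts = [p for p in table_path.split('/') if p]
    stem = '_'.join(parts) if full_path else parts[-1]
    return f"{stem}.{output_format}"  # assumption: extension equals the format name


def tables_to_export(tables_per_file: list[set[str]], min_file_count: int = 1) -> list[str]:
    """Keep only tables that appear in at least min_file_count files."""
    counts = Counter(path for tables in tables_per_file for path in tables)
    return sorted(path for path, n in counts.items() if n >= min_file_count)


# made-up inventory: the second file has no units table
tables_per_file = [{'/units', '/intervals/trials'}, {'/intervals/trials'}]

common = tables_to_export(tables_per_file, min_file_count=len(tables_per_file))
short_name = output_filename('/intervals/trials', 'parquet')
long_name = output_filename('/intervals/trials', 'parquet', full_path=True)
```

With this input, `common` is `['/intervals/trials']`, `short_name` is `'trials.parquet'` and `long_name` is `'intervals_trials.parquet'`.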

SQL queries

ctx = lazynwb.get_sql_context(nwb_paths)
df = ctx.execute("SELECT unit_id, location FROM units WHERE presence_ratio > 0.9").collect()
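
If you want to check what a query like this selects before pointing it at real files, you can run the same SELECT against a tiny in-memory table. The sketch below uses Python's built-in sqlite3 purely as a stand-in SQL engine; it is not what lazynwb uses internally, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE units (unit_id INTEGER, location TEXT, presence_ratio REAL)')
conn.executemany(
    'INSERT INTO units VALUES (?, ?, ?)',
    [(0, 'VISp', 0.95), (1, 'CA1', 0.42), (2, 'VISp', 0.99)],  # made-up example rows
)
# same query as above, with ORDER BY added so the result order is deterministic
rows = conn.execute(
    'SELECT unit_id, location FROM units WHERE presence_ratio > 0.9 ORDER BY unit_id'
).fetchall()
conn.close()
```

Only the two units with presence_ratio above 0.9 come back, each as a `(unit_id, location)` tuple.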

Cloud and remote files

All functions accept S3, GCS, Azure Blob Storage and HTTP/HTTPS paths in addition to local file paths:

# S3
df = lazynwb.get_df('s3://my-bucket/data/file.nwb', '/units')

# Google Cloud Storage
df = lazynwb.get_df('gs://my-bucket/data/file.nwb', '/units')

# Azure Blob Storage
df = lazynwb.get_df('az://my-container/data/file.nwb', '/units')

# HTTP/HTTPS
df = lazynwb.get_df('https://example.com/data/file.nwb', '/units')
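
The storage backend is implied by the URL scheme. As a small illustration of that idea (not lazynwb's own dispatch logic), the sketch below classifies a path with the standard library's `urlparse`. `classify_location` and `CLOUD_SCHEMES` are hypothetical names for this example.

```python
from urllib.parse import urlparse

# schemes taken from the examples above
CLOUD_SCHEMES = {'s3': 'S3', 'gs': 'Google Cloud Storage', 'az': 'Azure Blob Storage'}


def classify_location(path: str) -> str:
    """Label a path by the kind of storage its URL scheme points to."""
    scheme = urlparse(path).scheme.lower()
    if scheme in CLOUD_SCHEMES:
        return CLOUD_SCHEMES[scheme]
    if scheme in ('http', 'https'):
        return 'HTTP/HTTPS'
    # no recognised scheme: treat it as a local file path
    return 'local'


labels = [
    classify_location(p)
    for p in (
        's3://my-bucket/data/file.nwb',
        'gs://my-bucket/data/file.nwb',
        'az://my-container/data/file.nwb',
        'https://example.com/data/file.nwb',
        'my_file.nwb',
    )
]
```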

Configure cloud access via lazynwb.file_io.config:

from lazynwb.file_io import config

config.use_obstore = True                          # use obstore for S3/GCS/Azure (default: True)
config.use_remfile = False                         # use remfile for HTTP byte-range requests (default: False)
config.fsspec_storage_options = {"anon": True}     # e.g. anonymous S3 access
config.disable_cache = False                       # disable FileAccessor caching (default: False)

DANDI archive

Scan a table across all NWB files in a DANDI dandiset:

lf = lazynwb.scan_dandiset(
    dandiset_id='000363',
    table_path='/units',
    version='0.231012.2129',
    max_assets=10,  # limit number of files (useful for testing)
)
df = lf.collect()

Filter which assets to include:

lf = lazynwb.scan_dandiset(
    '000363',
    '/units',
    asset_filter=lambda asset: 'probe' in asset['path'],
)
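
To see how asset_filter and max_assets could interact, here is a minimal sketch over a list of asset dicts. It assumes the filter is applied before the limit, which is a guess about ordering rather than documented behaviour. `select_assets` and the asset paths are hypothetical.

```python
def select_assets(assets, asset_filter=None, max_assets=None):
    """Apply an optional predicate, then an optional cap on the number of assets."""
    selected = [a for a in assets if asset_filter is None or asset_filter(a)]
    # assumption: the cap is applied after filtering
    return selected if max_assets is None else selected[:max_assets]


# made-up asset records with only the 'path' key used by the filter above
assets = [
    {'path': 'sub-01/sub-01_probe-0.nwb'},
    {'path': 'sub-01/sub-01_behavior.nwb'},
    {'path': 'sub-02/sub-02_probe-1.nwb'},
]

probe_assets = select_assets(assets, asset_filter=lambda asset: 'probe' in asset['path'])
first_probe = select_assets(assets, asset_filter=lambda asset: 'probe' in asset['path'], max_assets=1)
```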

Get S3 URLs for all NWB files in a dandiset:

urls = lazynwb.get_dandiset_s3_urls('000363')

Open a single DANDI asset:

accessor = lazynwb.from_dandi_asset(
    dandiset_id='000363',
    asset_id='21c622b7-6d8e-459b-98e8-b968a97a1585',
)

Internal columns

When reading tables from multiple files, three columns are added automatically:

| Column | Description |
| --- | --- |
| `_nwb_path` | Path to the source NWB file |
| `_table_path` | Internal path of the table (e.g. `/units`) |
| `_table_index` | Row index in the original table |

These are available as constants: lazynwb.NWB_PATH_COLUMN_NAME, lazynwb.TABLE_PATH_COLUMN_NAME, lazynwb.TABLE_INDEX_COLUMN_NAME.

Maintainer: bjhardcastle

This version: 0.2.83 (Mar 4, 2026)
