A fast, chunked, memory-bounded Rust engine for electrophysiology (Neuropixels) signal processing, with Python bindings.

A fast, chunked, memory-bounded Rust engine for electrophysiology signal processing — Neuropixels-scale, callable from Python.

Segovia is a lazy-evaluated, chunked, concurrent compute engine for massive multi-channel electrophysiology time-series (Neuropixels-scale: 30 kHz × thousands of channels). It is written in Rust, exposed to Python via PyO3, and built to slot into the existing neuroscience stack — SpikeInterface, SpikeGLX, Zarr, and NWB — rather than replace it. The aim is out-of-core, bounded-memory streaming preprocessing (bandpass filtering, common-median referencing, whitening) with GIL-released shared-memory threads instead of the process-pool / pickle / per-process-copy model that makes Python spike-sorting pipelines run out of memory.

Status

Early development, pre-MVP. The first functional pieces have shipped: two chunked, memory-bounded readers that stream a recording as (samples, channels) int16 chunks behind a shared ChunkSource contract. These are a SpikeGLX .meta/.bin reader (segovia.SpikeGlxReader, v0.1.0) and a Zarr reader (segovia.ZarrReader, gzip/zstd/blosc). Both are published to crates.io and PyPI, so pip install segovia works. The compute engine (the bandpass → CMR → whiten chain) is not built yet, so parts of the quickstart below still describe the target API. The whole premise rests on one make-or-break benchmark; see The benchmark gate. Follow the roadmap for progress.
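
To make the reader contract concrete, here is a minimal NumPy sketch of what streaming a flat .meta/.bin pair as (samples, channels) int16 chunks can look like. It is not Segovia's implementation or API: `read_meta` and `iter_chunks` are hypothetical helper names, the sketch assumes the .bin file holds interleaved int16 samples, and it relies only on the `nSavedChans` key of a SpikeGLX-style key=value .meta file.

```python
import tempfile
from pathlib import Path

import numpy as np


def read_meta(meta_path):
    """Parse a SpikeGLX-style .meta file (key=value lines) into a dict."""
    meta = {}
    for line in Path(meta_path).read_text().splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            meta[key.lstrip("~")] = value  # some keys carry a leading '~'
    return meta


def iter_chunks(bin_path, n_channels, chunk_samples):
    """Yield (samples, channels) int16 chunks from an interleaved binary file."""
    data = np.memmap(bin_path, dtype=np.int16, mode="r")
    n_samples = data.size // n_channels
    frames = data[: n_samples * n_channels].reshape(n_samples, n_channels)
    for start in range(0, n_samples, chunk_samples):
        # Copy only this chunk out of the memory map.
        yield np.array(frames[start : start + chunk_samples])


# Tiny synthetic recording: 10 samples x 4 channels (hypothetical demo data).
tmp = Path(tempfile.mkdtemp())
samples = np.arange(10 * 4, dtype=np.int16).reshape(10, 4)
samples.tofile(tmp / "demo.bin")
(tmp / "demo.meta").write_text("nSavedChans=4\nimSampRate=30000\n")

meta = read_meta(tmp / "demo.meta")
chunks = list(iter_chunks(tmp / "demo.bin", int(meta["nSavedChans"]), chunk_samples=4))
print([c.shape for c in chunks])
```

Only one chunk is copied into memory at a time; the last chunk is shorter when the recording length is not a multiple of the chunk size.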

Why Segovia

A neuroscience lab can record a brain faster than its software can read it back. A single high-density Neuropixels probe writes roughly 80 GB/hour (~22 MB/s); standard Python pipelines load that at double size and then copy it wholesale into every worker process. Documented failures include a 26 GiB memory error filtering a modest recording and a 102 GiB blow-up during motion correction. The data is fine — the plumbing leaks.
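
The data-rate figures can be checked with a few lines of arithmetic. The inputs here are assumptions rather than numbers taken from this README: 384 recorded channels, 30 kHz sampling, and 16-bit samples. The "double size" line further assumes that the doubling comes from promoting int16 samples to float32, which is one plausible reading.

```python
# Back-of-envelope data rate for a single probe.
# Assumed inputs: 384 channels, 30 kHz sampling, int16 samples.
N_CHANNELS = 384
SAMPLE_RATE_HZ = 30_000
BYTES_PER_SAMPLE = 2  # int16

bytes_per_second = N_CHANNELS * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
mb_per_second = bytes_per_second / 1e6
gb_per_hour = bytes_per_second * 3600 / 1e9

# Assumption: "double size" means int16 promoted to float32 (4 bytes per sample).
float32_gb_per_hour = gb_per_hour * 2

print(f"{mb_per_second:.2f} MB/s, {gb_per_hour:.1f} GB/hour on disk")
print(f"{float32_gb_per_hour:.1f} GB/hour if held as float32")
```

Under these assumptions the on-disk rate comes out at about 23 MB/s and 83 GB/hour, in line with the rough figures quoted above.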

Segovia targets that plumbing. It is CPU-first (the workload is IO/memory-bound, so a GPU would spend more time waiting on the PCIe bus than computing), reuses mature Rust storage crates (zarrs, hdf5-metno, arrow-rs) instead of reinventing them, and earns its keep through one concrete advantage: true shared-memory threading in Rust with the GIL released. This is out-of-core spike-sorting preprocessing — bounded memory regardless of recording length, real-time capable, and callable from the Python tools researchers already use.
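
The shared-memory idea can be illustrated in pure Python as an analogy. In this sketch, worker threads read from and write into the same NumPy arrays, so nothing is pickled or copied per worker. It is not Segovia's Rust/Rayon code: `demean_block` is a hypothetical stand-in operation, and how much the threads actually overlap depends on NumPy releasing the GIL inside its array operations, which it does for many but not all of them.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def demean_block(src, dst, start, stop):
    """Write src[start:stop] minus its per-channel mean into dst[start:stop]."""
    block = src[start:stop]
    dst[start:stop] = block - block.mean(axis=0, keepdims=True)


rng = np.random.default_rng(0)
recording = rng.standard_normal((8_000, 16)).astype(np.float32)  # (samples, channels)
output = np.empty_like(recording)

# Non-overlapping row ranges, so threads never write to the same memory.
bounds = [(s, min(s + 1_000, len(recording))) for s in range(0, len(recording), 1_000)]

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(demean_block, recording, output, a, b) for a, b in bounds]
    for future in futures:
        future.result()  # re-raise any worker exception
```

Every worker sees the same `recording` and `output` buffers. A process pool would instead have to serialize each block to the worker and serialize the result back.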

How it works

flowchart LR
    A["Storage<br/>SpikeGLX .bin · Zarr · NWB/HDF5"] --> B["Chunked source<br/>channels × samples tiles"]
    B --> C["Op chain<br/>bandpass → CMR → whiten → detect"]
    C --> D["Sink<br/>zero-copy NumPy / Arrow"]
    C -. "Rayon over chunks, GIL released" .-> C
    style A fill:#0B1020,stroke:#5A6B8C,color:#F5F7FA
    style B fill:#0B1020,stroke:#5A6B8C,color:#F5F7FA
    style C fill:#0B1020,stroke:#CE422B,color:#F5F7FA
    style D fill:#0B1020,stroke:#DEA584,color:#F5F7FA

Data is read in chunks (spans of channels × samples), streamed through an operation chain, and returned to Python zero-copy. Only a bounded window is ever resident in memory — the metaphor is the Aqueduct of Segovia, a continuous stream carried span-by-span across a row of stone arches.
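
The bounded-window behaviour can be sketched with plain Python generators. This is an illustration of the streaming model only, not Segovia's engine: `source`, `to_float`, and `remove_offset` are hypothetical stages, and the data is synthetic. Each stage pulls one chunk at a time from the stage before it, so the largest array alive at any moment is a single chunk.

```python
import numpy as np


def source(n_samples, n_channels, chunk_samples):
    """Yield synthetic (samples, channels) int16 chunks, one at a time."""
    rng = np.random.default_rng(0)
    for start in range(0, n_samples, chunk_samples):
        rows = min(chunk_samples, n_samples - start)
        yield rng.integers(-100, 100, size=(rows, n_channels), dtype=np.int16)


def to_float(chunks):
    """Lazily convert each chunk to float32."""
    for chunk in chunks:
        yield chunk.astype(np.float32)


def remove_offset(chunks):
    """Lazily subtract each channel's mean within the chunk."""
    for chunk in chunks:
        yield chunk - chunk.mean(axis=0, keepdims=True)


# Nothing runs until the pipeline is iterated.
pipeline = remove_offset(to_float(source(10_000, 8, 1_000)))

total_rows = 0
largest_chunk_bytes = 0
for chunk in pipeline:
    total_rows += chunk.shape[0]
    largest_chunk_bytes = max(largest_chunk_bytes, chunk.nbytes)

print(total_rows, largest_chunk_bytes)
```

The per-chunk footprint stays the same however long the recording is. Operations with memory across time, such as filters, would additionally need overlap margins at chunk edges, which this sketch leaves out.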

Install

The readers are published to PyPI and crates.io; the compute engine is not yet included.

pip install segovia
cargo add segovia

Quickstart

Target API (illustrative, not yet shipped). Read a SpikeGLX recording, run the bandpass → common-median-reference → whiten chain in bounded memory, and get a zero-copy NumPy result.

import segovia

recording = segovia.read_spikeglx("data/probe0.imec0.ap.bin")

filtered = (
    recording
    .bandpass(low=300, high=6000)
    .common_median_reference()
    .whiten()
)

chunk = filtered.to_numpy(start=0, end=30_000)
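
For reference, here is a plain NumPy sketch of what common-median referencing computes on one (samples, channels) chunk: at each time sample, the median across channels is subtracted from every channel. This is the textbook operation written out for clarity, not Segovia's implementation, and `common_median_reference` here is a hypothetical free function rather than the target method above.

```python
import numpy as np


def common_median_reference(chunk):
    """Subtract the across-channel median from every sample (row)."""
    chunk = np.asarray(chunk, dtype=np.float32)
    reference = np.median(chunk, axis=1, keepdims=True)  # one value per sample
    return chunk - reference


# Two samples, four channels; the last channel of sample 0 is an outlier.
chunk = np.array(
    [[10, 12, 11, 50],
     [-3, -1, -2, -2]],
    dtype=np.int16,
)
referenced = common_median_reference(chunk)
print(referenced)
```

Because the median is used instead of the mean, the single outlier channel barely shifts the reference that is subtracted from the other channels.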

The benchmark gate

Segovia's existence hinges on one measurable claim (call it SC1): on a real 1-hour Neuropixels recording, the Rust bandpass + CMR + whiten chain must run in under 2 GB of peak memory and be faster than the equivalent spikeinterface(n_jobs=N) call on Windows and macOS. If that cannot be shown, the premise is wrong and the project says so. This benchmark is built first, not last. Result: pending — see the roadmap.
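
A gate like this reduces to two measurements and one comparison. The sketch below shows the shape of such a check using only the Python standard library; it is not the project's benchmark harness. `measure` and `passes_gate` are hypothetical names, "2 GB" is taken as 2 × 10^9 bytes by assumption, and `tracemalloc` only sees allocations made through Python's allocators, so a real benchmark of native code would need process-level peak memory instead.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once; return (result, seconds, peak_traced_bytes)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        seconds = time.perf_counter() - t0
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, seconds, peak


def passes_gate(seconds, peak_bytes, baseline_seconds, memory_limit_bytes=2 * 10**9):
    """True only if the run is both under the memory limit and faster than baseline."""
    return peak_bytes < memory_limit_bytes and seconds < baseline_seconds


def workload():
    """Stand-in workload: allocate about 5 MB and report its size."""
    buf = bytearray(5_000_000)
    return len(buf)


result, seconds, peak = measure(workload)
print(f"{seconds:.4f} s, peak {peak / 1e6:.1f} MB traced")
```

Both conditions must hold at once: a run that is fast but exceeds the memory limit fails the gate, as does one that fits in memory but is slower than the baseline.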

Architecture

The full architecture document set lives in docs/architecture/.

Roadmap

ROADMAP.md is the single source of truth for version and scope. In short: learn the domain and de-risk the toolchain (M0–2), prove the benchmark win (M2–4, the go/no-go gate), grow into a real engine with a Python API (M4–7), add breadth and correctness (M7–10), and ship as a SpikeInterface preprocessing backend (M10–12). A deferred, gated single-cell vertical sits beyond that — see docs/future/leukemia-direction.md.

Why the name

Segovia is named for Claudio Segovia, a friend who died of leukemia at 26. The name also evokes the Aqueduct of Segovia — a continuous stream carried across a long row of segmented stone arches, which is exactly this engine's chunked, span-by-span streaming model.

The connection is honest, not a marketing claim. An electrophysiology engine does not cure cancer, and saying otherwise would be dishonest. But the underlying computational problem — data too large for memory, and a Python layer that copies it until it chokes — is shared with single-cell genomics, the computational backbone of modern leukemia research (clonal evolution, drug resistance, CAR-T). Segovia's core is kept domain-neutral so the same machinery could one day help with that work too: aided by the tool, not a tool made for it. That direction is deliberately deferred and gated — the honest details, including disconfirming evidence, are in docs/future/leukemia-direction.md.

Contributing

Contributions are welcome — see CONTRIBUTING.md. The project is Windows-first, uses a Rust + PyO3 + maturin toolchain, conventional commits, and STAR-format PRs.

Citation

If you use Segovia in your research, please cite it via CITATION.cff (GitHub shows a "Cite this repository" button). A DOI will be added on the first archived release.

License

Segovia is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).

This is deliberate: Segovia is free for everyone — researchers, individuals, and non-profits — and the copyleft terms keep it that way. Anyone who distributes Segovia, or runs a modified version as a network service, must release their complete corresponding source under the same license, so the project cannot be taken closed-source or proprietary.

Unless you explicitly state otherwise, any contribution you submit for inclusion is licensed under AGPL-3.0-or-later, without any additional terms or conditions.
