A fast, chunked, memory-bounded Rust engine for electrophysiology (Neuropixels) signal processing, with Python bindings.

Segovia is a lazy-evaluated, chunked, concurrent compute engine for massive multi-channel electrophysiology time-series (Neuropixels-scale: 30 kHz × thousands of channels). It is written in Rust, exposed to Python via PyO3, and built to slot into the existing neuroscience stack — SpikeInterface, SpikeGLX, Zarr, and NWB — rather than replace it. The aim is out-of-core, bounded-memory streaming preprocessing (bandpass filtering, common-median referencing, whitening) with GIL-released shared-memory threads instead of the process-pool / pickle / per-process-copy model that makes Python spike-sorting pipelines run out of memory.

Status

Early development, pre-MVP. The first functional pieces have shipped: two chunked, memory-bounded readers that stream a recording as (samples, channels) int16 chunks behind a shared ChunkSource contract. These are a SpikeGLX .meta/.bin reader (segovia.SpikeGlxReader, v0.1.0) and a Zarr reader (segovia.ZarrReader, gzip/zstd/blosc). Both are published to crates.io and PyPI, so pip install segovia works. The compute engine (the bandpass → CMR → whiten chain) is not built yet, so parts of the quickstart below still describe the target API. The whole premise rests on one make-or-break benchmark; see The benchmark gate. Follow the roadmap for progress.

Why Segovia

A neuroscience lab can record a brain faster than its software can read it back. A single high-density Neuropixels probe writes roughly 80 GB/hour (~22 MB/s); standard Python pipelines load that at double size and then copy it wholesale into every worker process. Documented failures include a 26 GiB memory error filtering a modest recording and a 102 GiB blow-up during motion correction. The data is fine — the plumbing leaks.
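
The headline figures can be checked with a few lines of arithmetic. The sketch below assumes a probe with 384 recorded channels sampled at 30 kHz as 16-bit integers; the channel count and sample width are assumptions, not figures stated above, and "double size" is read here as an int16-to-float32 cast.

```python
# Assumed probe parameters (not stated in the text): 384 recorded channels,
# 30 kHz sampling, one int16 (2 bytes) per sample.
N_CHANNELS = 384
SAMPLE_RATE_HZ = 30_000
BYTES_PER_SAMPLE = 2

bytes_per_second = N_CHANNELS * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE
bytes_per_hour = bytes_per_second * 3600

mb_per_second = bytes_per_second / 1e6
gb_per_hour = bytes_per_hour / 1e9

# Assumption: "double size" means casting int16 samples to float32 (4 bytes).
float32_gb_per_hour = gb_per_hour * 2

print(f"{mb_per_second:.2f} MB/s, {gb_per_hour:.1f} GB/h raw, "
      f"{float32_gb_per_hour:.1f} GB/h as float32")
```

Under those assumptions the raw stream is about 23 MB/s, or about 83 GB per hour, in line with the rough figures quoted, and a float32 copy of one hour is about 166 GB before any worker process duplicates it.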

Segovia targets that plumbing. It is CPU-first (the workload is IO/memory-bound, so a GPU would spend more time waiting on the PCIe bus than computing), reuses mature Rust storage crates (zarrs, hdf5-metno, arrow-rs) instead of reinventing them, and earns its keep through one concrete advantage: true shared-memory threading in Rust with the GIL released. This is out-of-core spike-sorting preprocessing — bounded memory regardless of recording length, real-time capable, and callable from the Python tools researchers already use.
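
The sharing model behind that advantage can be sketched in plain Python. This is not Segovia's engine (which is Rust), only an illustration under stated assumptions: worker threads read disjoint slices of one shared NumPy array, so nothing is pickled or copied per worker. The function names and array sizes are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def chunk_sum_of_squares(data: np.ndarray, start: int, stop: int) -> np.ndarray:
    """Per-channel sum of squares for rows [start, stop) of a shared array."""
    # The slice is a view into the shared buffer; only this chunk is widened
    # to int64 (to avoid int16 overflow when squaring).
    block = data[start:stop].astype(np.int64)
    return (block * block).sum(axis=0)


def threaded_rms(data: np.ndarray, chunk_samples: int, workers: int = 4) -> np.ndarray:
    """Per-channel RMS, computed chunk by chunk on a thread pool."""
    n_samples = data.shape[0]
    bounds = [
        (s, min(s + chunk_samples, n_samples))
        for s in range(0, n_samples, chunk_samples)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda b: chunk_sum_of_squares(data, *b), bounds)
        total = sum(partials)  # combine the per-chunk partial sums
    return np.sqrt(total / n_samples)


# Synthetic (samples, channels) int16 recording shared by all threads.
rng = np.random.default_rng(0)
recording = rng.integers(-500, 500, size=(40_000, 16), dtype=np.int16)
rms = threaded_rms(recording, chunk_samples=10_000)
```

How much real parallelism Python threads get depends on whether the underlying calls release the GIL; the point here is only that every worker sees the same buffer. A process pool would instead have to ship each chunk, or the whole array, to every worker.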

How it works

flowchart LR
    A["Storage<br/>SpikeGLX .bin · Zarr · NWB/HDF5"] --> B["Chunked source<br/>channels × samples tiles"]
    B --> C["Op chain<br/>bandpass → CMR → whiten → detect"]
    C --> D["Sink<br/>zero-copy NumPy / Arrow"]
    C -. "Rayon over chunks, GIL released" .-> C
    style A fill:#0B1020,stroke:#5A6B8C,color:#F5F7FA
    style B fill:#0B1020,stroke:#5A6B8C,color:#F5F7FA
    style C fill:#0B1020,stroke:#CE422B,color:#F5F7FA
    style D fill:#0B1020,stroke:#DEA584,color:#F5F7FA

Data is read in chunks (spans of channels × samples), streamed through an operation chain, and returned to Python zero-copy. Only a bounded window is ever resident in memory — the metaphor is the Aqueduct of Segovia, a continuous stream carried span-by-span across a row of stone arches.
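
The bounded-window idea can be sketched with a plain generator. This is an illustration, not Segovia's reader: it assumes a flat file of interleaved little-endian int16 samples (the layout of a SpikeGLX .bin file), and the function name and parameters are hypothetical.

```python
import os
import tempfile
from typing import Iterator

import numpy as np


def iter_chunks(path: str, n_channels: int, chunk_samples: int) -> Iterator[np.ndarray]:
    """Yield (samples, channels) int16 chunks; one chunk is read at a time."""
    frame_bytes = n_channels * 2  # one int16 per channel per time point
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk_samples * frame_bytes)
            n_frames = len(buf) // frame_bytes
            if n_frames == 0:
                break
            # count= ignores any trailing partial frame at end of file.
            flat = np.frombuffer(buf, dtype="<i2", count=n_frames * n_channels)
            yield flat.reshape(n_frames, n_channels)


# Demo on a small synthetic file: 2,500 time points x 4 channels.
original = np.arange(10_000, dtype="<i2").reshape(2_500, 4)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.bin")
    original.tofile(path)

    # Streaming reduction: per-channel peak without loading the whole file.
    peak = np.zeros(4, dtype=np.int64)
    n_seen = 0
    for chunk in iter_chunks(path, n_channels=4, chunk_samples=1_000):
        peak = np.maximum(peak, np.abs(chunk.astype(np.int64)).max(axis=0))
        n_seen += chunk.shape[0]
```

Because the consumer holds only the current chunk and a small running result, peak memory is set by chunk_samples rather than by the length of the recording.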

Install

Segovia is published to PyPI and crates.io; install it with pip or add it as a Cargo dependency:

pip install segovia
cargo add segovia

Quickstart

Target API (illustrative, not yet shipped). Read a SpikeGLX recording, run the bandpass → common-median-reference → whiten chain in bounded memory, and get a zero-copy NumPy result.

import segovia

recording = segovia.read_spikeglx("data/probe0.imec0.ap.bin")

filtered = (
    recording
    .bandpass(low=300, high=6000)
    .common_median_reference()
    .whiten()
)

chunk = filtered.to_numpy(start=0, end=30_000)
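
As a reference for what the chain computes, here is a minimal NumPy sketch of the common-median-reference and whitening steps on one in-memory chunk. It is not Segovia's implementation: the bandpass stage is omitted, ZCA whitening is one common choice assumed here, and all function names are hypothetical.

```python
import numpy as np


def common_median_reference(chunk: np.ndarray) -> np.ndarray:
    """Subtract, at each time point, the median across channels."""
    x = chunk.astype(np.float64)
    return x - np.median(x, axis=1, keepdims=True)


def zca_whiten(x: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Decorrelate channels so their covariance is close to the identity."""
    centered = x - x.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / centered.shape[0]
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eps guards against division by a near-zero eigenvalue.
    w = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ w


# Synthetic (samples, channels) chunk: independent noise plus a shared artifact.
rng = np.random.default_rng(0)
noise = rng.normal(0.0, 20.0, size=(5_000, 8))
artifact = rng.normal(0.0, 100.0, size=(5_000, 1))
chunk = np.round(noise + artifact).astype(np.int16)

referenced = common_median_reference(chunk)
whitened = zca_whiten(referenced)
whitened_cov = whitened.T @ whitened / whitened.shape[0]
```

In this sketch the shared artifact is largely removed by the median subtraction, and whitened_cov comes out close to the identity matrix. A streaming engine would have to estimate the whitening matrix without holding the full recording, which this single-chunk version does not address.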

The benchmark gate

Segovia's existence hinges on one measurable claim (call it SC1): on a real 1-hour Neuropixels recording, the Rust bandpass + CMR + whiten chain must run in under 2 GB of peak memory and be faster than the equivalent spikeinterface(n_jobs=N) call on Windows and macOS. If that cannot be shown, the premise is wrong and the project says so. This benchmark is built first, not last. Result: pending — see the roadmap.
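
A gate like SC1 reduces to two measurements and one comparison. The harness below is a hypothetical sketch, not the project's benchmark. It uses the standard-library tracemalloc, which only sees allocations made through Python's allocator, so memory allocated natively by a Rust extension would need a process-level measure such as resident set size instead. Reading the 2 GB budget as 2 × 1024³ bytes is also an assumption.

```python
import time
import tracemalloc
from typing import Any, Callable, Tuple

# Assumption: the "2 GB" budget is interpreted as 2 * 1024**3 bytes.
MEMORY_BUDGET_BYTES = 2 * 1024**3


def measure(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float, int]:
    """Run fn once; return (result, elapsed seconds, peak traced bytes)."""
    tracemalloc.start()
    t0 = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - t0
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def passes_gate(candidate_s: float, candidate_peak: int, baseline_s: float) -> bool:
    """True only if the candidate is under budget AND faster than the baseline."""
    return candidate_peak < MEMORY_BUDGET_BYTES and candidate_s < baseline_s


# Stand-in workload; a real run would call the preprocessing chain instead.
result, seconds, peak_bytes = measure(sum, range(100_000))
```

Both conditions must hold at once: a run that is faster but over budget, or under budget but slower than the baseline, fails the gate.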

Architecture

The full architecture document set lives in docs/architecture/.

Roadmap

ROADMAP.md is the single source of truth for version and scope. In short: learn the domain and de-risk the toolchain (M0–2), prove the benchmark win (M2–4, the go/no-go gate), grow into a real engine with a Python API (M4–7), add breadth and correctness (M7–10), and ship as a SpikeInterface preprocessing backend (M10–12). A deferred, gated single-cell vertical sits beyond that — see docs/future/leukemia-direction.md.

Why the name

Segovia is named for Claudio Segovia, a friend who died of leukemia at 26. The name also evokes the Aqueduct of Segovia — a continuous stream carried across a long row of segmented stone arches, which is exactly this engine's chunked, span-by-span streaming model.

The connection is honest, not a marketing claim. An electrophysiology engine does not cure cancer, and saying otherwise would be dishonest. But the underlying computational problem — data too large for memory, and a Python layer that copies it until it chokes — is shared with single-cell genomics, the computational backbone of modern leukemia research (clonal evolution, drug resistance, CAR-T). Segovia's core is kept domain-neutral so the same machinery could one day help with that work too: aided by the tool, not a tool made for it. That direction is deliberately deferred and gated — the honest details, including disconfirming evidence, are in docs/future/leukemia-direction.md.

Contributing

Contributions are welcome — see CONTRIBUTING.md. The project is Windows-first, uses a Rust + PyO3 + maturin toolchain, conventional commits, and STAR-format PRs.

Citation

If you use Segovia in your research, please cite it via CITATION.cff (GitHub shows a "Cite this repository" button). A DOI will be added on the first archived release.

License

Segovia is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).

This is deliberate: Segovia is free for everyone — researchers, individuals, and non-profits — and the copyleft terms keep it that way. Anyone who distributes Segovia, or runs a modified version as a network service, must release their complete corresponding source under the same license, so the project cannot be taken closed-source or proprietary.

Unless you explicitly state otherwise, any contribution you submit for inclusion is licensed under AGPL-3.0-or-later, without any additional terms or conditions.

Download files

Source Distribution

segovia-0.3.0.tar.gz (223.2 kB)

Uploaded: Source

Built Distributions

segovia-0.3.0-cp38-abi3-win_amd64.whl (1.9 MB)

Uploaded: CPython 3.8+, Windows x86-64

segovia-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.7 MB)

Uploaded: CPython 3.8+, manylinux: glibc 2.17+, x86-64

segovia-0.3.0-cp38-abi3-macosx_11_0_arm64.whl (2.2 MB)

Uploaded: CPython 3.8+, macOS 11.0+, ARM64

File details

Details for the file segovia-0.3.0.tar.gz.

File metadata

  • Download URL: segovia-0.3.0.tar.gz
  • Upload date:
  • Size: 223.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for segovia-0.3.0.tar.gz
Algorithm Hash digest
SHA256 ead407f5aa450c1e27ee64ecb2c65611b0401d6b01af4f28006ecba379a103a5
MD5 04d6bb98c3f51802b38d51cb56abe2f7
BLAKE2b-256 41352ee9bf9a69131e7a7796e3a9c97325a4cbb9f1bf2291b67c643be0416697

Provenance

The following attestation bundles were made for segovia-0.3.0.tar.gz:

Publisher: release.yml on fcarvajalbrown/Segovia

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file segovia-0.3.0-cp38-abi3-win_amd64.whl.

File metadata

  • Download URL: segovia-0.3.0-cp38-abi3-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.8+, Windows x86-64
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.12

File hashes

Hashes for segovia-0.3.0-cp38-abi3-win_amd64.whl
Algorithm Hash digest
SHA256 af25d1a7c690d480b1e0710e4bcacc61bdd49ae48ed39fe4536fd1fc40ef4b95
MD5 5ad53fd5dc17d88602cad8625cb906cf
BLAKE2b-256 0e4d64896dac3dc0e70952db4cc0005382ec2b8bc714df942085c7932fffc1b6

Provenance

The following attestation bundles were made for segovia-0.3.0-cp38-abi3-win_amd64.whl:

Publisher: release.yml on fcarvajalbrown/Segovia

File details

Details for the file segovia-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.

File hashes

Hashes for segovia-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 3027d1a987d3f2407671bf51eea8550c7bb6e9d2bfbcc57fe491a49ad7e0efea
MD5 9734bce4011df19b32be6770418eed73
BLAKE2b-256 3bd5d5ca98395c036a8c012d8c2fe762ddd21629f6348a0703d02b2fb2165a28

Provenance

The following attestation bundles were made for segovia-0.3.0-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl:

Publisher: release.yml on fcarvajalbrown/Segovia

File details

Details for the file segovia-0.3.0-cp38-abi3-macosx_11_0_arm64.whl.

File hashes

Hashes for segovia-0.3.0-cp38-abi3-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f1953e9d36a7b11b2189298d766524378a87a5f3a14384eddd3c29f44dcb7a86
MD5 2c67d96370b9b113e802b80acbcffcfe
BLAKE2b-256 9ee7d097e94d5564e263b9d5e652c919f99f342e9785b0db5a3d063bb2bd71d9

Provenance

The following attestation bundles were made for segovia-0.3.0-cp38-abi3-macosx_11_0_arm64.whl:

Publisher: release.yml on fcarvajalbrown/Segovia
