DARTsort

dartsort is a modular spike sorter built around a statistical clustering model and a new approach to probe motion. It is also a toolkit of modules for building spike sorters or other analyses of electrophysiology data.

Warning: work in progress

We do not currently recommend DARTsort for production spike sorting purposes. Please feel free to open an issue or a discussion if you run into problems.

Installation

Installing into an existing environment

If you already have a Python environment with PyTorch working and you just want to get dartsort there, use

$ pip install dartsort

If you want to run the test suite or use dartsort.vis, you can install the optional dependencies with pip install "dartsort[test,vis]" (the quotes keep shells such as zsh from interpreting the square brackets).

If you need a Python environment, see the next section.

Setting up a Python environment

There are a few ways to get Python and PyTorch set up, including new tools like uv, but I find that a conda-forge-based distribution is still the most reliable at installing the GPU dependencies that PyTorch needs. (Note: conda-forge is different from the non-free Anaconda.)

You can use conda-forge to install Python, dartsort, and its dependencies as follows:

  • Follow the conda-forge installation instructions for your platform at https://conda-forge.org/download/
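  • Create an environment from an environment.yml file in your working directory (presumably the one from the dartsort repository) with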
    $ mamba env create -f environment.yml
    
    This will create an environment called dartsort, but you can change the name by adding -n othername.
  • Activate the environment:
    $ mamba activate dartsort
    
  • Install dartsort and the rest of its dependencies by running the pip command above
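
After either installation route, a quick sanity check can confirm that the packages are visible to the Python you are running. The sketch below is not part of dartsort; the helper names installed_versions and cuda_available are made up for illustration. It relies only on the standard library plus PyTorch's torch.cuda.is_available(), and it assumes the PyPI distribution names dartsort and torch.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_available():
    """Return True/False from PyTorch, or None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.is_available()


if __name__ == "__main__":
    # Hypothetical check; adjust the list to the packages you care about.
    print(installed_versions(["dartsort", "torch"]))
    print("CUDA available:", cuda_available())
```

A None next to dartsort or torch means that package is not installed in the active environment, which usually points to the wrong environment being activated.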

Usage

As a Python function

dartsort can be run from inside Python with:

import dartsort

dartsort.dartsort(recording, output_dir)

# or, to set configuration options, something like...
dartsort_result = dartsort.dartsort(
    recording,
    output_dir,
    cfg=dartsort.DARTsortUserConfig(
        preprocessing="ibllike",
        work_in_tmpdir=True,
        copy_recording_to_tmpdir=True,
    ),
)

Please read the important configuration details section below. Some options, like preprocessing, are not set by default and need your input! (This could change.)

Here, recording is a SpikeInterface recording object. SpikeInterface can read every electrophysiology data format that I've encountered and many I haven't; see the SpikeInterface documentation.

output_dir is the folder where dartsort will save its output.
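
As an illustration of where recording might come from, here is a minimal sketch that reads a SpikeGLX folder with SpikeInterface and passes it to dartsort. The wrapper name sort_spikeglx_folder and the stream id "imec0.ap" are assumptions for the example, not something dartsort prescribes; swap in whichever SpikeInterface reader matches your data format.

```python
def sort_spikeglx_folder(folder, output_dir, preprocessing="ibllikecmr"):
    """Read a SpikeGLX recording and run dartsort on it (illustrative wrapper).

    Returns whatever dartsort.dartsort returns.
    """
    # Imported lazily so that defining this function does not require
    # dartsort or SpikeInterface to be installed.
    import dartsort
    import spikeinterface.extractors as se

    # Assumption: the action-potential band lives in the "imec0.ap" stream.
    recording = se.read_spikeglx(folder, stream_id="imec0.ap")

    # preprocessing is set explicitly here rather than left at its default.
    cfg = dartsort.DARTsortUserConfig(preprocessing=preprocessing)
    return dartsort.dartsort(recording, output_dir, cfg=cfg)
```

Calling sort_spikeglx_folder("/path/to/run_g0", "/path/to/output") would then run the sorter with the chosen preprocessing; both paths are placeholders.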

Once you've run dartsort, you might want to check out the outputs and exporting section below.

Important configuration details

Before running dartsort, please be aware of the following important configuration options.

  • preprocessing: dartsort won't touch your data by default (preprocessing="none"), leaving you free to implement your own preprocessing in SpikeInterface or otherwise. This means dartsort will fail if you leave this flag unset and pass in data in its original raw state (for instance, the raw int16 data off the probe).
    • For a cheap but sensible default, try preprocessing="ibllikecmr", which applies a pipeline similar to that of the IBL but with global median common referencing instead of their spatial highpass filter. preprocessing="ibllike" will use their spatial highpass filter.
  • do_motion_estimation=True by default. You may want to disable it if you know for a fact that there is (say) less than 5 microns of total drift in your recording, or if you have handled drift in your own preprocessing (which is discouraged, since dartsort has its own approach).
  • work_in_tmpdir and copy_recording_to_tmpdir can be helpful in some cases where slow network drives are involved.
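
To make "global median common referencing" concrete, here is a toy sketch in plain Python: at each time sample, the median across all channels is subtracted from every channel. This illustrates the general technique only. It is not dartsort's or the IBL's implementation, and the function name is made up for the example.

```python
from statistics import median


def common_median_reference(traces):
    """Subtract the across-channel median from every sample.

    traces is a list of time samples, each a list of per-channel values.
    """
    referenced = []
    for sample in traces:
        # Signal shared by all channels at this instant (e.g. common noise).
        shared = median(sample)
        referenced.append([value - shared for value in sample])
    return referenced


# Two time samples recorded on three channels.
raw = [[1, 2, 3], [4, 5, 9]]
print(common_median_reference(raw))  # [[-1, 0, 1], [-1, 0, 4]]
```

Because the median is taken over all channels at once, noise common to the whole probe is removed while a spike confined to a few channels barely shifts it.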

Outputs and exporting

Calling dartsort_result = dartsort.dartsort(...) returns a dictionary. The DARTsortSorting object is stored under the key "sorting", so sorting = dartsort_result["sorting"]. This object has all the spike train data attached as arrays: .times_samples and .times_seconds hold spike times in samples and seconds, .labels holds unit labels, and there are many others (print(sorting) to see some more).
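
Since spike times and unit labels come as parallel arrays, a common first step is to group the times by unit. The helper below is a hypothetical sketch, not part of dartsort. It works on any two equal-length sequences, so it should accept sorting.times_samples and sorting.labels directly, assuming those are one-dimensional integer arrays.

```python
from collections import defaultdict


def spike_times_by_unit(times_samples, labels):
    """Group spike times (in samples) by their unit label."""
    by_unit = defaultdict(list)
    for time, label in zip(times_samples, labels):
        # int() turns numpy integer scalars into plain Python ints.
        by_unit[int(label)].append(int(time))
    return dict(by_unit)


# Toy data standing in for sorting.times_samples and sorting.labels.
print(spike_times_by_unit([0, 30000, 60000], [1, 0, 1]))
# {1: [0, 60000], 0: [30000]}
```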

If you already ran dartsort and want to load the output spike trains, use dartsort.load(output_dir) to get the DARTsortSorting object.

This object can also export itself to other formats:

  • For a SpikeInterface NumpySorting object, use sorting.to_numpy_sorting()
  • For a Pynapple TsGroup, use sorting.to_tsgroup()
  • To export to Phy, we currently suggest bridging through SpikeInterface. Start with sorting.to_numpy_sorting() and follow the instructions in SpikeInterface's documentation for first creating a SortingAnalyzer and then exporting that to Phy.
  • For a simple dictionary of numpy arrays, use features = sorting.spike_feature_dict.
  • For pandas, use sorting.to_pandas().

dartsort also saves motion information, returned as dartsort_result["motion"] or loaded after the fact as dartsort.try_load_motion_info(output_dir).

The data is saved to output_dir in the following files:

  • dartsort_sorting.npz: This NPZ file contains the final spike train under the keys times_samples, channels, and labels.
  • matching1.h5: This HDF5 file contains spike features and other data from the last matching step. Amplitudes, localizations, and other features live in here; use h5ls on the command line to see what's in there. Be aware that the labels dataset in this HDF5 is not the same as what's saved in the dartsort_sorting.npz.
  • motion_info.pkl is a pickled MotionInfo object.
  • There may be a models/ folder containing PyTorch weights files with modeling quantities (for instance, featurization SVD bases, localization neural nets, Gaussian mixture model parameters). If you want to load these up, feel free to reach out for help.
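
For a quick look at what an NPZ file holds without loading any arrays, note that an NPZ is a zip archive with one .npy entry per key. The sketch below lists those keys using only the standard library; the helper name npz_keys is made up for the example. To get the arrays themselves, numpy.load on the same path returns them by key.

```python
import zipfile
from pathlib import Path


def npz_keys(npz_path):
    """List the array names stored in an NPZ file, sorted alphabetically."""
    with zipfile.ZipFile(npz_path) as archive:
        return sorted(
            Path(name).stem
            for name in archive.namelist()
            if name.endswith(".npy")
        )


if __name__ == "__main__":
    # Placeholder path; point this at your own output_dir.
    print(npz_keys("output_dir/dartsort_sorting.npz"))
```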

Visualization

To make some basic visualizations of the sorting result with matplotlib, try:

import dartsort
import dartsort.vis as dartvis

# gather outputs from a fresh dartsort run
dartsort_result = dartsort.dartsort(recording, output_dir)
sorting = dartsort_result["sorting"]
motion = dartsort_result["motion"]

# or, if you already ran it, load them from output_dir instead
sorting = dartsort.load(output_dir)
motion = dartsort.try_load_motion_info(output_dir)

# vis_save_dir is the folder where the plots will be saved
dartvis.visualize_sorting(
    recording,
    sorting,
    vis_save_dir,
    motion=motion,
    make_unit_summaries=False,
)

Set make_unit_summaries=True to create a summary plot for each unit.

Command-line interface

Try running

$ dartsort -h

on your command line to see usage instructions; parameters can be configured on the command line or read from a TOML file.

Troubleshooting

Please let us know if you run into any issues. If you feel that the issue is a software bug, feel free to open an issue or a discussion on GitHub. If it's more of a data-related or methodology thing, feel free to use the email on my GitHub profile.
