DARTsort
dartsort
dartsort is a modular spike sorter built around a statistical clustering model and a new approach to probe motion. It is also a toolkit of modules for building spike sorters or other analyses of electrophysiology data.
Warning: work in progress
We do not currently recommend DARTsort for production spike sorting purposes. Please feel free to open an issue or a discussion if you run into problems.
Installation
Installing into an existing environment
If you already have a Python environment with PyTorch working and you just want to get dartsort there, use
$ pip install dartsort
If you want to run the test suite or use dartsort.vis, you can install the optional dependencies with pip install "dartsort[test,vis]" (the quotes keep shells like zsh from interpreting the brackets).
If you need a Python environment, see the next section.
Setting up a Python environment
There are a few ways to get Python and PyTorch set up, including newer tools like uv, but I find that a conda-forge-based distribution is still the most reliable way to install the GPU dependencies that PyTorch needs. (Note: conda-forge is different from the non-free Anaconda.)
You can use conda-forge to install Python, dartsort, and its dependencies as follows:
- Follow the conda-forge installation instructions for your platform at https://conda-forge.org/download/
- Create an environment with
  $ mamba env create -f environment.yml
  This will create an environment called dartsort, but you can change the name by adding -n othername.
- Activate the environment:
  $ mamba activate dartsort
- Install dartsort and the rest of its dependencies by running the pip command above.
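After installing, it can help to confirm that both PyTorch and dartsort are visible to the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name check_environment is hypothetical and not part of dartsort.

```python
import importlib.util


def check_environment(packages=("torch", "dartsort")):
    """Return a dict mapping each package name to True if it can be found."""
    return {
        name: importlib.util.find_spec(name) is not None for name in packages
    }


status = check_environment()
for name, found in status.items():
    print(f"{name}: {'found' if found else 'missing'}")

# If PyTorch is present, also report whether it can see a CUDA GPU.
if status["torch"]:
    import torch

    print("CUDA available:", torch.cuda.is_available())
```

If either package is reported missing, the active environment is probably not the one you installed into.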
Usage
As a Python function
dartsort can be run from inside Python with:
import dartsort

dartsort.dartsort(recording, output_dir)

# or, to set configuration options, something like...
dartsort_result = dartsort.dartsort(
    recording,
    output_dir,
    cfg=dartsort.DARTsortUserConfig(
        preprocessing="ibllike",
        work_in_tmpdir=True,
        copy_recording_to_tmpdir=True,
    ),
)
Please read the important configuration details section below. Some options, like preprocessing, are not set by default and need your input! (This could change.)
Here, recording is a SpikeInterface recording object.
SpikeInterface can read every electrophysiology data format that I've encountered and many I haven't; see the SpikeInterface documentation for the supported formats.
output_dir is the folder where dartsort will save its output.
Once you've run dartsort, you might want to check out the outputs and exporting section below.
Important configuration details
Before running dartsort, please be aware of the following important configuration options.
- preprocessing: dartsort won't touch your data by default (preprocessing="none"), leaving you free to implement your own preprocessing in SpikeInterface or otherwise. As a result, dartsort will explode if you leave this flag unset and pass in data in its original raw state (for instance, the raw int16 data off the probe).
  - For a cheap but sensible default, try preprocessing="ibllikecmr", which applies a pipeline similar to that of the IBL but with global median common referencing instead of their spatial highpass filter. preprocessing="ibllike" will use their spatial highpass filter.
- do_motion_estimation=True by default. You may like to disable it if you know for a fact that there is (say) less than 5 microns of total drift in your recording, or if you have handled drift in your own preprocessing (which is discouraged, since dartsort has its own approach).
- work_in_tmpdir and copy_recording_to_tmpdir can be helpful in some cases where slow network drives are involved.
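The guidance above can be sketched as a small helper that picks keyword arguments for DARTsortUserConfig. The function suggest_config_kwargs and its parameters are hypothetical, written only to illustrate the decisions; the option names and values come from the list above.

```python
def suggest_config_kwargs(
    already_preprocessed, negligible_drift=False, slow_network_drive=False
):
    """Hypothetical helper: turn the guidance above into keyword arguments."""
    kwargs = {
        # "none" is only safe if the recording was preprocessed elsewhere;
        # otherwise fall back to the cheap "ibllikecmr" pipeline.
        "preprocessing": "none" if already_preprocessed else "ibllikecmr",
        # Motion estimation stays on unless drift is known to be negligible.
        "do_motion_estimation": not negligible_drift,
    }
    if slow_network_drive:
        kwargs["work_in_tmpdir"] = True
        kwargs["copy_recording_to_tmpdir"] = True
    return kwargs


kwargs = suggest_config_kwargs(already_preprocessed=False, slow_network_drive=True)
print(kwargs)

# With dartsort installed, these could then be passed on as:
#   cfg = dartsort.DARTsortUserConfig(**kwargs)
#   dartsort.dartsort(recording, output_dir, cfg=cfg)
```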
Outputs and exporting
Calling dartsort_result = dartsort.dartsort(...) returns a dictionary. The DARTsortSorting object is stored under the key "sorting", so sorting = dartsort_result["sorting"].
This object has all the spike train data attached as arrays: .times_samples and .times_seconds hold spike times in samples and seconds, .labels holds unit labels, and there are many others (print(sorting) to see some more).
If you already ran dartsort and want to load the output spike trains, use dartsort.load(output_dir) to get the DARTsortSorting object.
This object can also export itself to other formats:
- For a SpikeInterface NumpySorting object, use sorting.to_numpy_sorting()
- For a Pynapple TsGroup, use sorting.to_tsgroup()
- To export to Phy, we currently suggest bridging through SpikeInterface. Start with sorting.to_numpy_sorting() and follow the instructions in SpikeInterface's documentation for first creating a SortingAnalyzer and then exporting that to Phy.
- For a simple dictionary of numpy arrays, use dict = sorting.spike_feature_dict.
- For pandas, use sorting.to_pandas().
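As a rough sketch of the Phy bridge described above, the function below follows the pattern from SpikeInterface's export documentation. The function name export_to_phy_via_spikeinterface is hypothetical, and the choice of extensions and n_components=3 is an assumption; check the SpikeInterface documentation for the version you have installed before relying on it.

```python
def export_to_phy_via_spikeinterface(recording, dartsort_sorting, phy_dir):
    """Hypothetical helper: DARTsortSorting -> NumpySorting -> Phy folder."""
    # Imported here so that defining the function does not require
    # SpikeInterface to be installed.
    from spikeinterface.core import create_sorting_analyzer
    from spikeinterface.exporters import export_to_phy

    # Step 1: convert the dartsort result to a SpikeInterface sorting.
    numpy_sorting = dartsort_sorting.to_numpy_sorting()

    # Step 2: build a SortingAnalyzer and compute the extensions that the
    # SpikeInterface Phy export example computes before exporting.
    analyzer = create_sorting_analyzer(sorting=numpy_sorting, recording=recording)
    analyzer.compute(["random_spikes", "waveforms", "templates", "noise_levels"])
    analyzer.compute("spike_amplitudes")
    analyzer.compute("principal_components", n_components=3, mode="by_channel_local")

    # Step 3: write the Phy folder.
    export_to_phy(sorting_analyzer=analyzer, output_folder=phy_dir)
    return phy_dir
```

It would be called as export_to_phy_via_spikeinterface(recording, sorting, "phy_output"), where sorting is the DARTsortSorting object and "phy_output" is a folder name of your choosing.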
dartsort also saves motion information, returned as dartsort_result["motion"] or loaded after the fact as dartsort.try_load_motion_info(output_dir).
The data is saved to output_dir in the following files:
- dartsort_sorting.npz: This NPZ file contains the final spike train under the keys times_samples, channels, and labels.
- matching1.h5: This HDF5 file contains spike features and other data from the last matching step. Amplitudes, localizations, and other features live in here; use h5ls on the command line to see what's in there. Be aware that the labels dataset in this HDF5 is not the same as what's saved in the dartsort_sorting.npz.
- motion_info.pkl is a pickled MotionInfo object.
- There may be a models/ folder containing PyTorch weights files with modeling quantities (for instance, featurization SVD bases, localization neural nets, Gaussian mixture model parameters). If you want to load these up, feel free to reach out for help.
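If you only need the final spike train, the NPZ file can be read with plain NumPy, without importing dartsort. The sketch below counts spikes per unit label; the helper spikes_per_unit is hypothetical, and the small file written here is a synthetic stand-in that uses only the three documented keys. How dartsort labels unassigned spikes is not covered here, so treat every distinct label as its own group.

```python
import tempfile
from pathlib import Path

import numpy as np


def spikes_per_unit(npz_path):
    """Count spikes for each distinct value in the 'labels' array."""
    with np.load(npz_path) as data:
        labels = data["labels"]
    units, counts = np.unique(labels, return_counts=True)
    return dict(zip(units.tolist(), counts.tolist()))


# Synthetic stand-in so the example runs without a real sorting.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "dartsort_sorting.npz"
    np.savez(
        path,
        times_samples=np.array([10, 25, 40, 90]),
        channels=np.array([3, 3, 7, 7]),
        labels=np.array([0, 0, 1, 0]),
    )
    counts = spikes_per_unit(path)

print(counts)
```

For a real run, point spikes_per_unit at the dartsort_sorting.npz file inside your output_dir.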
Visualization
To make some basic visualizations of the sorting result with matplotlib, try:
import dartsort
import dartsort.vis as dartvis

# gather outputs from dartsort
dartsort_result = dartsort.dartsort(recording, output_dir)  # plus cfg=... if needed
sorting = dartsort_result["sorting"]
motion = dartsort_result["motion"]

# or, if you already ran it
sorting = dartsort.load(output_dir)
motion = dartsort.try_load_motion_info(output_dir)

# vis_save_dir is a folder of your choice for the figures
dartvis.visualize_sorting(
    recording,
    sorting,
    vis_save_dir,
    motion=motion,
    make_unit_summaries=False,
)
Set make_unit_summaries=True to create a summary plot for each unit.
Command-line interface
Try running
$ dartsort -h
on your command line to see usage instructions; parameters can be configured on the command line or read from a TOML file.
Troubleshooting
Please let us know if you run into any issues. If you feel that the issue is a software bug, feel free to open an issue or a discussion on GitHub. If it's more of a data-related or methodology thing, feel free to use the email on my GitHub profile.