# pydataknot

Audio feature selection, model training, and hyperparameter optimization for Max/MSP DataKnot classifiers.
## Installation

Requires Python >= 3.9.

We recommend creating a virtual Python environment first:

```shell
python3 -m venv venv
source venv/bin/activate
```

Then install via pip:

```shell
pip install --upgrade pip
pip install pydataknot
```

If you plan to perform hyperparameter optimization and want to use the Optuna Dashboard to inspect results, install the optional extra:

```shell
pip install "pydataknot[dashboard]"
```
## Usage

pydataknot includes a set of CLI tools that take as input JSON files produced by dk.classcreate in Max/MSP. To start, record a dataset using dk.classcreate and write it to a JSON file.
### Feature Selection

Select a subset of audio features for classification using Minimum Redundancy Maximum Relevance (mRMR). The goal is to select features that correlate strongly with the classes (maximum relevance) while having low correlation with each other (minimum redundancy).

This command selects 12 features from the dataset in dataset.json:

```shell
pydk-select data=dataset.json num_features=12
```

This copies dataset.json, adds the selected feature indices, and saves the copy in an output directory outputs/feature_selection/{current-date}/{current-time}/.
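The greedy mRMR criterion can be sketched in a few lines of NumPy. This is an illustrative implementation, not pydataknot's actual code; the feature matrix `X` and label vector `y` are made-up inputs.

```python
import numpy as np

def mrmr_select(X, y, num_features):
    """Greedy mRMR: at each step, pick the feature with the highest
    relevance (|correlation with the class labels|) minus redundancy
    (mean |correlation| with the already-selected features)."""
    n_features = X.shape[1]
    # Relevance: absolute correlation of each feature with the labels.
    relevance = np.array(
        [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(n_features)]
    )
    # Redundancy: pairwise absolute correlations between features.
    feat_corr = np.abs(np.corrcoef(X, rowvar=False))

    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < num_features:
        remaining = [j for j in range(n_features) if j not in selected]
        # mRMR score = relevance minus mean redundancy w.r.t. selected set.
        scores = [relevance[j] - feat_corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected
```

pydk-select applies this idea to the audio features stored in the dataset JSON and records the winning feature indices in the output copy.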
### Model Training

Train MLP classifiers using PyTorch.

This command trains an MLP on the dataset in dataset.json. The MLP will have two hidden layers with 16 neurons each (note the quotes around the layers list) and will be trained for 500 epochs:

```shell
pydk-train data=dataset.json mlp.hidden_layers="[16,16]" mlp.max_iters=500
```

Note: If you use the JSON file output from the feature selection step, training will use only that subset of selected features.

As with feature selection, this copies the input JSON file and saves it, along with the trained model, in an output directory outputs/feature_selection/{current-date}/{current-time}/. dk.classmatch can load this JSON file and will automatically use the model trained here.

Run pydk-train --help to see a list of all possible arguments, or see flucoma-torch for more detailed information on arguments.
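To make concrete what the `mlp.hidden_layers` and epoch settings control, here is a minimal stand-in using scikit-learn's `MLPClassifier`. This is a conceptual sketch only: pydataknot itself trains with PyTorch via flucoma-torch, and the data below is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for a dataset recorded with dk.classcreate.
X, y = make_classification(n_samples=300, n_features=12, n_classes=3,
                           n_informative=6, random_state=0)

# hidden_layer_sizes=(16, 16) mirrors mlp.hidden_layers="[16,16]";
# max_iter=500 mirrors training for 500 epochs.
clf = MLPClassifier(hidden_layer_sizes=(16, 16), max_iter=500, random_state=0)
clf.fit(X, y)
print(f"training accuracy: {clf.score(X, y):.2f}")
```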
### Hyperparameter Search

Perform a search over MLP parameters to find an architecture and training settings optimized for your dataset. This uses Optuna under the hood.

This command performs a hyperparameter search of 1000 trials (1000 different MLPs will be trained), where each MLP is trained for 100 epochs:

```shell
pydk-optimize data=dataset.json mlp.max_iter=100 n_trials=1000
```

This could take several hours to complete.
The best resulting model and its associated hyperparameters are saved in the output directory outputs/optimize_classifier/{current-date}/{current-time}/. dk.classmatch can load the model file and will automatically use the model trained here.

See flucoma-torch for a more detailed listing of possible arguments.
### Feature Optimization

The number of input features can also be included in the optimization by setting optimize_features=true. This performs mRMR feature selection with different values of num_features to test which leads to the best performing model:

```shell
pydk-optimize data=dataset.json mlp.max_iter=100 n_trials=1000 optimize_features=true
```
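Conceptually, optimize_features folds the feature count into the outer search: each candidate value of num_features triggers a fresh selection pass before training. A minimal illustration of that loop, using a simple correlation ranking and scikit-learn as stand-ins for mRMR and PyTorch (not pydataknot's actual code):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)

# Rank features by |correlation with the labels| (crude stand-in for mRMR).
relevance = np.array(
    [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
)
ranking = np.argsort(relevance)[::-1]

best_k, best_score = None, -np.inf
for k in (4, 8, 12):  # candidate values of num_features
    subset = ranking[:k]
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=0)
    score = cross_val_score(clf, X[:, subset], y, cv=3).mean()
    if score > best_score:
        best_k, best_score = k, score

print(f"best num_features: {best_k} (cv accuracy {best_score:.2f})")
```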
### Optuna Dashboard

To view all results in a web dashboard, you can use the Optuna Dashboard. It is an optional dependency; to install it:

```shell
pip install "pydataknot[dashboard]"
```

During an optimization study, results are stored in a SQLite database file in the output directory corresponding to the study. Whether a sqlite3 file is saved depends on the sqlite argument, which defaults to true.

Point optuna-dashboard at that file. For example:

```shell
optuna-dashboard sqlite:///outputs/optimize_classifier/2025-09-25/19-58-57/classifier_study.sqlite3
```
## File details

### pydataknot-0.0.3.tar.gz

- Download URL: pydataknot-0.0.3.tar.gz
- Upload date:
- Size: 336.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.20

| Algorithm | Hash digest |
|---|---|
| SHA256 | 4d3cf1f28367681616273819765d206ab2799190b36acbad15ea176f01534a42 |
| MD5 | f0452f4af2cdf651863f155412f2e7a2 |
| BLAKE2b-256 | 3bded4b9b6473eca418ad0245025ab717819c2da80fd240d17047cc1ff0c7b7c |
### pydataknot-0.0.3-py3-none-any.whl

- Download URL: pydataknot-0.0.3-py3-none-any.whl
- Upload date:
- Size: 12.0 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.4.20

| Algorithm | Hash digest |
|---|---|
| SHA256 | 865eb3cde0d252a9c931e972eb5694198649ad4d05a7583dd68b016c604bf410 |
| MD5 | ed748be5d909f312c90b82f0e53c8927 |
| BLAKE2b-256 | bb51166440c1a41c0d672ba6c1f209030c8ddafe31c05ddad864f4ec5beb9b08 |