Automated behavioral event detection in bio-logging data.
stickleback
A machine learning pipeline for detecting fine-scale behavioral events in bio-logging data.
Installation
Install with pip.
pip install stickleback
Key concepts
- Behavioral events are brief behaviors that can be represented as a point in time, e.g. feeding or social interactions.
- High-resolution bio-logging data (e.g. from accelerometers and magnetometers) are multi-variate time series. Traditional classifiers struggle with time series data.
- `stickleback` takes a time series classification approach to detect behavioral events in longitudinal bio-logging data.
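To make the two data types concrete, here is a minimal sketch of how sensor data and events might be laid out with pandas. The deployment ID `example-01`, the values, and the choice of a `DatetimeIndex` for events are illustrative assumptions, not part of the packaged sample data.

```python
import pandas as pd

# Hypothetical 5 s of 10 Hz sensor data (values are made up for illustration).
index = pd.date_range("2018-09-05 12:00:00", periods=50, freq="100ms", name="datetime")
sensors_one = pd.DataFrame(
    {
        "depth": [10.0 + 0.1 * i for i in range(50)],
        "speed": [2.0] * 50,
    },
    index=index,
)

# Two hypothetical events, each a single point on the sensor time axis.
# (Storing events as a DatetimeIndex is an assumption made for this sketch.)
events_one = pd.DatetimeIndex(["2018-09-05 12:00:01.500", "2018-09-05 12:00:03.200"])

# Deployments are keyed by ID, mirroring the dictionaries used in the quick start.
sensors = {"example-01": sensors_one}
events = {"example-01": events_one}
```

The sensor table is a multivariate time series (one row per sample, one column per variable), while each event is just a timestamp.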
Quick start
Load sample data
The included sensor data contains the depth, pitch, roll, and speed of six blue whales at 10 Hz, and the event data contains the times of lunge-feeding behaviors.
import pandas as pd
import sktime.classification.interval_based
import sktime.classification.compose
from stickleback.stickleback import Stickleback
import stickleback.data
import stickleback.util
import stickleback.visualize
# Load sample data
sensors, events = stickleback.data.load_lunges()
# Split into test (first 2 deployments) and train (the remaining deployments)
def split_dict(d, ks):
    dict1 = {k: v for k, v in d.items() if k in ks}
    dict2 = {k: v for k, v in d.items() if k not in ks}
    return dict1, dict2
test_deployids = list(sensors.keys())[0:2]
sensors_test, sensors_train = split_dict(sensors, test_deployids)
events_test, events_train = split_dict(events, test_deployids)
sensors[test_deployids[0]]
datetime | depth | pitch | roll | speed |
---|---|---|---|---|
2018-09-05 11:55:52.400 | 14.911083 | -0.059933 | -0.012899 | 4.274450 |
2018-09-05 11:55:52.500 | 14.910864 | -0.067072 | -0.010815 | 4.044154 |
2018-09-05 11:55:52.600 | 14.915853 | -0.075173 | -0.008335 | 3.820568 |
2018-09-05 11:55:52.700 | 14.923190 | -0.085225 | -0.005727 | 3.602702 |
2018-09-05 11:55:52.800 | 14.928955 | -0.096173 | -0.002803 | 3.432342 |
... | ... | ... | ... | ... |
2018-09-05 13:55:51.900 | 22.552306 | -0.010861 | 0.005441 | 2.246061 |
2018-09-05 13:55:52.000 | 22.571625 | -0.010534 | 0.004674 | 2.257525 |
2018-09-05 13:55:52.100 | 22.588129 | -0.010081 | 0.003841 | 2.267966 |
2018-09-05 13:55:52.200 | 22.603341 | -0.009627 | 0.003042 | 2.272327 |
2018-09-05 13:55:52.300 | 22.619537 | -0.009355 | 0.002164 | 2.277328 |
72000 rows × 4 columns
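As a quick sanity check on the 10 Hz sampling rate, the row count and the first and last timestamps shown above should describe the same time span. The sketch below uses only the standard library; the constant names are chosen for illustration.

```python
from datetime import datetime

SAMPLE_RATE_HZ = 10  # sampling rate stated for the sample data
N_ROWS = 72000       # row count reported for this deployment

fmt = "%Y-%m-%d %H:%M:%S.%f"
first = datetime.strptime("2018-09-05 11:55:52.400", fmt)
last = datetime.strptime("2018-09-05 13:55:52.300", fmt)

# Span implied by the timestamps, in seconds
span_from_timestamps = (last - first).total_seconds()

# Span implied by the row count: N rows cover N - 1 sampling intervals
span_from_rows = (N_ROWS - 1) / SAMPLE_RATE_HZ

duration_hours = span_from_rows / 3600
print(round(duration_hours, 6))  # 1.999972
```

Both calculations give 7199.9 s, just under two hours, which agrees with the "Duration (hours)" value in the outcome table later in this guide.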
Visualize sensor and event data
`plot_sensors_events()` produces an interactive figure for exploring bio-logger data.
# Choose one deployment to visualize
deployid = list(sensors.keys())[0]
stickleback.visualize.plot_sensors_events(deployid, sensors, events)
Define model
Initialize a `Stickleback` model using Supervised Time Series Forests and a 10 s window.
# Supervised Time Series Forests ensembled across the columns of `sensors`
cols = sensors[list(sensors.keys())[0]].columns
tsc = sktime.classification.interval_based.SupervisedTimeSeriesForest(
    n_estimators=2,
    random_state=4321
)
stsf = sktime.classification.compose.ColumnEnsembleClassifier(
    estimators=[('STSF_{}'.format(col), tsc, [i])
                for i, col in enumerate(cols)]
)
sb = Stickleback(
    local_clf=stsf,
    win_size=50,
    tol=pd.Timedelta("5s"),
    nth=10,
    n_folds=4,
    seed=1234
)
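The list comprehension passed to `ColumnEnsembleClassifier` can be hard to read at a glance. The sketch below rebuilds just that list for the four sample columns, with a plain placeholder object standing in for the sktime classifier (the placeholder is an assumption so the snippet runs without sktime installed).

```python
# Column names as they appear in the sample sensor data
cols = ["depth", "pitch", "roll", "speed"]

# Placeholder standing in for the SupervisedTimeSeriesForest instance
placeholder_clf = object()

# One (name, estimator, column indices) tuple per sensor column
estimators = [
    ("STSF_{}".format(col), placeholder_clf, [i])
    for i, col in enumerate(cols)
]

for name, _, column_idx in estimators:
    print(name, column_idx)
# STSF_depth [0]
# STSF_pitch [1]
# STSF_roll [2]
# STSF_speed [3]
```

Each tuple names one estimator and lists the column index it is given, so every sensor variable gets its own entry in the ensemble.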
Fit model
Fit the `Stickleback` object to the training data.
sb.fit(sensors_train, events_train)
Test model
Use the fitted `Stickleback` model to predict the occurrence of lunge-feeding events in the test dataset.
predictions = sb.predict(sensors_test)
Assess results
Use the temporal tolerance (in this example, 5 s) to assess model predictions. Summarize the results with an outcome table and an interactive figure. In the figure, blue = true positive, hollow red = false negative, and solid red = false positive.
outcomes = sb.assess(predictions, events_test)
stickleback.visualize.outcome_table(outcomes, sensors_test)
deployid | F1 | TP | FP | FN | Duration (hours) |
---|---|---|---|---|---|
bw180905-49 | 1.000000 | 44 | 0 | 0 | 1.999972 |
bw180905-53 | 0.943396 | 25 | 2 | 1 | 1.999972 |
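The F1 column can be reproduced from the TP, FP, and FN counts with the standard definition F1 = 2·TP / (2·TP + FP + FN). The helper name `f1_score` below is hypothetical and not part of `stickleback`.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1 score from true positive, false positive, and false negative counts."""
    denom = 2 * tp + fp + fn
    # Return 0.0 when there are no events and no predictions at all
    return 2 * tp / denom if denom else 0.0


# Counts taken from the outcome table above
print(round(f1_score(44, 0, 0), 6))  # 1.0
print(round(f1_score(25, 2, 1), 6))  # 0.943396
```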
deployid = list(events_test.keys())[0]
stickleback.visualize.plot_predictions(deployid,
                                       sensors_test,
                                       predictions,
                                       outcomes)
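The three outcome classes in the figure follow from comparing predicted and actual event times under a temporal tolerance. The sketch below shows one simple way to do that with greedy one-to-one matching. It is an illustration under stated assumptions, not necessarily how `sb.assess()` works internally, and the function name `classify_outcomes` and the event times are hypothetical.

```python
from datetime import datetime, timedelta


def classify_outcomes(predicted, actual, tol):
    """Greedily match each prediction to the nearest unmatched actual event.

    Returns (true_positives, false_positives, false_negatives) as lists of times.
    """
    unmatched = sorted(actual)
    tp, fp = [], []
    for p in sorted(predicted):
        # Closest actual event that has not been matched yet, if any remain
        nearest = min(unmatched, key=lambda a: abs(a - p), default=None)
        if nearest is not None and abs(nearest - p) <= tol:
            tp.append(p)
            unmatched.remove(nearest)  # each actual event is matched at most once
        else:
            fp.append(p)
    # Actual events that no prediction matched are false negatives
    return tp, fp, unmatched


# Hypothetical event times, in seconds after an arbitrary start
t0 = datetime(2018, 9, 5, 12, 0, 0)
actual = [t0 + timedelta(seconds=s) for s in (10, 60, 120)]
predicted = [t0 + timedelta(seconds=s) for s in (12, 58, 90)]

tp, fp, fn = classify_outcomes(predicted, actual, tol=timedelta(seconds=5))
print(len(tp), len(fp), len(fn))  # 2 1 1
```

With a 5 s tolerance, the predictions at 12 s and 58 s match the events at 10 s and 60 s, the prediction at 90 s is a false positive, and the event at 120 s is a false negative.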