Accompanying library to the AURSAD dataset
Project description
AURSAD
A Python library for the AURSAD dataset. A detailed description of the dataset and the download itself are available separately.
The library provides several utilities for preprocessing the dataset for ML applications:
- Creating numpy training and test datasets for sampled data
- Creating Keras TimeSeries generators for sliding window data
- Prediction and classification modes (illustrated by the sketch after this list)
  - In the default prediction mode, the target label of a window is the label of the next sample. This can be used to train a sliding window model that predicts the class of the next sample from the current window.
  - In classification mode, the target label of a window is the most common label within that window.
- Filtering the dataset
  - Removing undesired columns as outlined in the paper
- 3 different types of labeling
  - Full sample labeling, where the loosening and tightening motions are labeled together
  - Separate sample labeling, where the loosening motion is given its own label
  - 'Tighten' sample labeling, where only the tightening part of the process is labeled as normal/anomalous, while the loosening and movement parts each get their own separate label
- Binary labels, where every anomaly is given the same label
- Subsampling the data
- Dimensionality reduction using PCA or ANOVA F-values
- One-hot label encoding
- Zero padding the samples to equalise their length
- Z-score standardisation
- Taking data only from the screwdriver sensors
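To make the two window-labelling modes concrete, here is a minimal standalone sketch. It is not part of the aursad API; the toy label sequence and window size are made up for illustration:

import numpy as np
from collections import Counter

labels = np.array([0, 0, 0, 1, 1, 1, 1, 0, 0, 0])  # toy per-sample labels
window_size = 4

for start in range(len(labels) - window_size):
    window = labels[start:start + window_size]
    # Prediction mode: the target is the label of the sample right after the window
    prediction_target = labels[start + window_size]
    # Classification mode: the target is the most common label inside the window
    classification_target = Counter(window).most_common(1)[0][0]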
Dataset
The dataset contains 2045 samples in total. The robot was sampled at a frequency of 100 Hz.
Type | Label | Samples | % |
---|---|---|---|
Normal operation | 0 | 1420 | 69 |
Damaged screw | 1 | 221 | 11 |
Extra assembly component | 2 | 183 | 9 |
Missing screw | 3 | 218 | 11 |
Damaged thread | 4 | 3 | 0 |
Additionally, there are 2049 supplementary samples describing the loosening/screw-picking motion, labeled 5.
Installation
AURSAD has been tested on Windows 10 and Python 3.8.
PIP installation
To install from pip with the required dependencies, use:
pip install aursad
Source installation
To install the latest version from GitHub, clone the source from the project repository and install it with setup.py:
git clone https://github.com/CptPirx/robo-package
cd robo-package
python setup.py install --user
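Either way, a successful installation makes the package importable:

python -c "import aursad"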
Instructions
The package exposes two methods to the user: get_dataset_numpy() and get_dataset_generator().
Sampling
def get_dataset_numpy(path, onehot_labels=True, reduce_dimensionality=False, reduce_method='PCA', n_dimensions=60,
subsample_data=True, subsample_freq=2, train_size=0.7, random_state=42, normal_samples=1,
damaged_samples=1, assembly_samples=1, missing_samples=1, damaged_thread_samples=0,
loosening_samples=1, move_samples=1, drop_extra_columns=True, pad_data=True,
label_type='partial', binary_labels=False, standardize=False, screwdriver_only=False):
"""
Create numpy dataset from input file
:param assembly_samples: float,
percentage of extra assembly samples to take
:param binary_labels: bool,
if True all anomalies are labeled the same
:param damaged_samples: float,
percentage of damaged samples to take
:param damaged_thread_samples: float,
percentage of damaged thread samples to take
:param drop_extra_columns: bool,
drop the extra columns as outlined in the paper
:param label_type: string,
'full', 'partial' or 'tighten'
:param loosening_samples: float,
percentage of loosening samples to take
:param missing_samples: float,
percentage of missing samples to take
:param move_samples: float,
percentage of movement samples to take
:param n_dimensions: int,
the target number of dimensions
:param normal_samples: float,
percentage of normal samples to take
:param onehot_labels: bool,
output onehot encoded labels
:param pad_data: bool,
if True pad data to equal length samples, if False return data in continuous form
:param path: string,
path to the data
:param random_state: int,
random state for train_test split
:param reduce_dimensionality: bool,
reduce dimensionality of the dataset
:param reduce_method: string,
dimensionality reduction method to be used
:param screwdriver_only: bool,
take only the 4 dimensions from the screwdriver sensors
:param standardize: bool,
if True apply z-score standardisation
:param subsample_data: bool,
reduce number of events by taking every subsample_freq event
:param subsample_freq: int,
the frequency of subsampling
:param train_size: float,
percentage of data as training data
:return: 4 np arrays,
train and test data & labels
"""
Sample usage:
import aursad
data_path = 'C:/Users/my_path/robot_data.h5'
train_x, train_y, test_x, test_y = aursad.get_dataset_numpy(data_path)
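The documented parameters can be combined in a single call. As a sketch of a more customised invocation, using only parameters listed in the docstring above (the path is a placeholder), one might take half of the normal samples, drop the loosening samples, collapse all anomalies into one binary label, and reduce the data to 30 PCA dimensions:

import aursad

data_path = 'C:/Users/my_path/robot_data.h5'
train_x, train_y, test_x, test_y = aursad.get_dataset_numpy(
    data_path,
    normal_samples=0.5,         # take 50% of the normal samples
    loosening_samples=0,        # drop the supplementary loosening samples
    binary_labels=True,         # give every anomaly the same label
    reduce_dimensionality=True,
    reduce_method='PCA',
    n_dimensions=30)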
Sliding window
def get_dataset_generator(path, window_size=100, reduce_dimensionality=False, reduce_method='PCA', n_dimensions=60,
                          subsample_data=True, subsample_freq=2, train_size=0.7, random_state=42, normal_samples=1,
                          damaged_samples=1, assembly_samples=1, missing_samples=1, damaged_thread_samples=0,
                          loosening_samples=1, move_samples=1, drop_extra_columns=True, label_type='partial',
                          batch_size=256, binary_labels=False, standardize=False, screwdriver_only=False,
                          onehot_labels=True, prediction_mode=True):
"""
Create Keras sliding window generator from input file
:param assembly_samples: float,
percentage of extra assembly samples to take
:param batch_size: int,
batch size for the sliding window generator
:param binary_labels: bool,
if True all anomalies are labeled the same
:param damaged_samples: float,
percentage of damaged samples to take
:param damaged_thread_samples: float,
percentage of damaged thread samples to take
:param drop_extra_columns: bool,
drop the extra columns as outlined in the paper
:param label_type: string,
'full', 'partial' or 'tighten'
:param loosening_samples: float,
percentage of loosening samples to take
:param missing_samples: float,
percentage of missing samples to take
:param move_samples: float,
percentage of movement samples to take
:param n_dimensions: int,
the target number of dimensions
:param normal_samples: float,
percentage of normal samples to take
:param onehot_labels: bool,
output onehot encoded labels
:param path: string,
path to the data
:param prediction_mode: bool,
if True the target of a window [x_0, x_100] is label of x_101, if False, the target is the most common label in [x_0, x_100]
:param random_state: int,
random state for train_test split
:param reduce_dimensionality: bool,
reduce dimensionality of the dataset
:param reduce_method: string,
dimensionality reduction method to be used
:param screwdriver_only: bool,
take only the 4 dimensions from the screwdriver sensors
:param standardize: bool,
if True apply z-score standardisation
:param subsample_data: bool,
reduce number of events by taking every subsample_freq event
:param subsample_freq: int,
the frequency of subsampling
:param train_size: float,
percentage of data as training data
:param window_size: int,
size of the sliding window
:return: 4 np arrays and 2 keras TimeSeries generators,
train and test data & labels, plus the train and test generators
"""
Sample usage:
import aursad
data_path = 'C:/Users/my_path/robot_data.h5'
train_x, train_y, test_x, test_y, train_generator, test_generator = aursad.get_dataset_generator(data_path)
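The returned generators can be fed directly to a Keras model. The sketch below is a hedged, minimal example: it assumes TensorFlow 2.x, that the generators follow the Keras Sequence/TimeseriesGenerator interface (so they support batch indexing and model.fit()), and that onehot_labels is left at its default of True; the network itself is a placeholder, not a recommended architecture:

import aursad
import tensorflow as tf

data_path = 'C:/Users/my_path/robot_data.h5'
train_x, train_y, test_x, test_y, train_generator, test_generator = aursad.get_dataset_generator(data_path)

# Infer input/output shapes from one generator batch:
# batch_x has shape (batch_size, window_size, n_features), batch_y is one-hot
batch_x, batch_y = train_generator[0]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=batch_x.shape[1:]),
    tf.keras.layers.LSTM(64),  # placeholder recurrent layer
    tf.keras.layers.Dense(batch_y.shape[-1], activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(train_generator, validation_data=test_generator, epochs=5)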