Sensor data acquisition library.
Project description
PyDSlog
PyDSlog aims to make data acquisition for machine learning and AI applications on the RMG/941 fast and easy.
It is installed by default on the RMG/941. It reads the sensor values arriving over the serial RS485 interface or via MQTT and stores them in CSV files.
Installation
pip install pydslog
Access to stream
To use the stream, initialize the class with the desired frequency, the desired channels, the serial port and the size of the block to be read.
For the MLS/160A:
x = PyDSlog.stream.MLS160A_stream(sz_block=500,
channels_to_use=["ACCX", "ACCY", "ACCZ", "GYRX", "GYRY", "GYRZ"],
frequency=500, port="COM15", baudrate=115200)
and for the IO5640-DS:
x = PyDSlog.stream.IO5640_stream(sz_block=100,
channels_to_use=["AI4U", "AI3U", "AI2U", "AI1U", "AI1I", "AI2I"],
frequency=500, port="COM15", baudrate=115200)
Where:
- sz_block: the size of the block read at once when the stream is read.
- channels_to_use: the channels to use. Possible values are ["ACCX", "ACCY", "ACCZ", "GYRX", "GYRY", "GYRZ"] for the MLS/160A and ["AI4U", "AI3U", "AI2U", "AI1U", "AI1I", "AI2I"] for the IO5640-DS.
- frequency: the desired sampling frequency.
- port: the serial port.
- baudrate: fixed at 115200; it cannot be changed.
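Together, sz_block and frequency determine how much time one block covers. The following is a small sketch of that arithmetic, assuming the sensor delivers samples at exactly the requested rate; the helper name block_duration is illustrative and not part of PyDSlog.

```python
def block_duration(sz_block, frequency):
    """Seconds of data contained in one block of sz_block samples."""
    return sz_block / frequency


# Values taken from the two constructor calls shown above
mls_seconds = block_duration(sz_block=500, frequency=500)  # MLS/160A example
io_seconds = block_duration(sz_block=100, frequency=500)   # IO5640-DS example

print(mls_seconds, io_seconds)
```

With these settings one block holds about 1 s of data on the MLS/160A and about 0.2 s on the IO5640-DS.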
To access the stream directly, the sensor must first be connected to the serial port. The connect() method is used for this purpose. The start() method starts the stream: after it is called, the sensor starts sending the values of the selected channels at the desired frequency. To read the values, use the read(transpose=False) method. Finally, use the stop() method to stop the stream and the disconnect() method to release the serial port.
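The call order described above (connect, start, read, stop, disconnect) can be wrapped in a context manager so that the port is released even when reading fails. This is only a sketch: running_stream and FakeStream are hypothetical names, and FakeStream merely stands in for a real stream object so the example runs without a sensor attached.

```python
from contextlib import contextmanager


@contextmanager
def running_stream(stream):
    """Connect and start a stream, then always stop and disconnect it."""
    stream.connect()
    try:
        stream.start()
        try:
            yield stream
        finally:
            stream.stop()
    finally:
        stream.disconnect()


class FakeStream:
    """Hypothetical stand-in for a sensor stream; it records the calls it receives."""

    def __init__(self):
        self.calls = []

    def connect(self):
        self.calls.append("connect")

    def start(self):
        self.calls.append("start")

    def read(self, transpose=False):
        self.calls.append("read")
        return [[0.0, 0.0], [0.0, 0.0]]  # dummy block, not real sensor data

    def stop(self):
        self.calls.append("stop")

    def disconnect(self):
        self.calls.append("disconnect")


fake = FakeStream()
with running_stream(fake) as s:
    block = s.read(transpose=False)

print(fake.calls)
```

Any object that exposes the same five methods could be passed to running_stream in place of FakeStream.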
As an example, using MLS/160A:
# import PyDSlog
import PyDSlog.stream as stream
# size of the block to read is 500
# the channels that will be streamed are ["ACCX","ACCY","ACCZ","GYRX","GYRY","GYRZ"]
# a frequency of 500 Hz
# sensor is connected to port COM15
# the baudrate is fixed at 115200 and can not be changed.
x = stream.MLS160A_stream(sz_block=500, channels_to_use=["ACCX","ACCY","ACCZ","GYRX","GYRY","GYRZ"],
                          frequency=500, port="COM15", baudrate=115200, n_frame=100)
try:
    # connect
    x.connect()
    # start
    x.start()
    # read stream. if you want to transpose the values use transpose=True.
    r = x.read(transpose=False)
finally:
    # stop
    x.stop()
    # disconnect from port
    x.disconnect()
Generate CSV file from sensor values
PyDSlog was developed to generate CSV files for training machine learning algorithms.
The following classes are available for this purpose:
For the MLS/160A:
x = PyDSlog.csv.MLS160A_csv_saver(port, channels_to_use, frequency, block_size,
filepath, filename=None, labeled=False, save_as_signal=False,
header=True, custom_header=None, add_tmp=None, date_format="%d/%m/%Y,%H:%M:%S",
baudrate=115200, w_mode="a", delimiter=",")
and for the IO5640-DS:
x = PyDSlog.csv.IO5640_csv_saver(port, channels_to_use, frequency, block_size,
filepath, filename=None, labeled=False, save_as_signal=False,
header=True, custom_header=None, add_tmp=None, date_format="%d/%m/%Y,%H:%M:%S",
baudrate=115200, w_mode="a", delimiter=",")
Where:
- block_size: the size of the block read at once when the stream is read.
- channels_to_use: the channels to use. Possible values are ["ACCX", "ACCY", "ACCZ", "GYRX", "GYRY", "GYRZ"] for the MLS/160A and ["AI4U", "AI3U", "AI2U", "AI1U", "AI1I", "AI2I"] for the IO5640-DS.
- frequency: the desired sampling frequency.
- port: the serial port.
- baudrate: fixed at 115200; it cannot be changed.
- labeled: default False. Boolean. Set labeled=True if the sensor values have to be labelled (in order to be used with supervised learning).
- filepath: default None. String pointing to the location of the CSV file.
- filename: default None. String with the name of the file.
- save_as_signal: default False. Boolean. Set save_as_signal=True if the values have to be interpreted as signals. This makes it possible to use an FFT function later. A separate CSV file is generated for each channel. The values that represent a signal are separated by commas, and a newline character separates the signals from each other. Each signal has a length of block_size values.
- header: default True. Boolean. If True, the name of the channel is used as the header.
- custom_header: default None. String. Use it when a header other than the channel names is needed.
- add_tmp: default None. Also possible: "date", "us" or "ms". Use it to add a timestamp. With add_tmp="date", the format in date_format is used.
- date_format: default "%d/%m/%Y,%H:%M:%S". The format used for the date. Only necessary if add_tmp="date".
- delimiter: default ",". The separator character used.
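The default date_format contains a comma, which is also the default delimiter. The short sketch below uses only the standard datetime module to show what that format string produces. Whether PyDSlog writes the formatted string verbatim into the row is an assumption here; if it does, the date and the time end up in two separate CSV columns. The timestamp used is arbitrary.

```python
from datetime import datetime

date_format = "%d/%m/%Y,%H:%M:%S"  # default value from the signatures above

# Arbitrary example time, not real sensor data
stamp = datetime(2021, 3, 5, 14, 7, 9).strftime(date_format)

# Splitting on the default delimiter yields a date field and a time field
fields = stamp.split(",")

print(stamp)
print(fields)
```

A date_format without a comma, or a different delimiter, would keep the timestamp in a single column.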
Once the class that generates the CSV file is initialized, the start() method starts saving the sensor values into the CSV file. The pause() method pauses the recording: it stops the sensor stream but does not close the CSV file or release the serial port. To reactivate the stream and save more values in the file, call the start() method again. Finally, the stop() method ends the recording.
If the values are labeled, the set_label(label) method configures the label to be used. The label parameter is a number that represents the label with which the values are stored in the file.
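A labeled recording session could be organised as one segment per label, pausing between segments. The sketch below is an assumption-laden illustration: record_labeled and FakeSaver are hypothetical names, the method names start(), pause(), stop() and set_label() are taken from the description above, and it is assumed that set_label() may be called while the recording is paused. FakeSaver only logs calls so the example runs without hardware.

```python
import time


def record_labeled(saver, schedule, sleep=time.sleep):
    """Record one segment per (label, seconds) pair, pausing between segments."""
    try:
        for label, seconds in schedule:
            saver.set_label(label)  # label applied to the values that follow
            saver.start()
            sleep(seconds)
            saver.pause()
    finally:
        saver.stop()  # always end the recording


class FakeSaver:
    """Hypothetical stand-in for a CSV saver; it only logs the calls it receives."""

    def __init__(self):
        self.log = []

    def set_label(self, label):
        self.log.append(("set_label", label))

    def start(self):
        self.log.append(("start",))

    def pause(self):
        self.log.append(("pause",))

    def stop(self):
        self.log.append(("stop",))


fake = FakeSaver()
# Two segments of 3 seconds each; sleeping is skipped here to keep the demo instant
record_labeled(fake, [(0, 3), (1, 3)], sleep=lambda seconds: None)
print(fake.log)
```

With a real saver created with labeled=True, the default sleep would be kept so that each segment actually records for the given number of seconds.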
As an example, using MLS/160A:
# import libraries
import PyDSlog.csv as csv
import time
# initialize
x = csv.MLS160A_csv_saver(port="COM3", channels_to_use=["ACCX","ACCY","ACCZ"], frequency=500,
                          block_size=500, filepath="test/", filename="file.csv",
                          labeled=False, save_as_signal=False, header=True,
                          add_tmp="ms", baudrate=115200, w_mode="a")
# start
x.start_csv()
# wait..
time.sleep(3)
#pause
x.pause_csv()
#wait again
time.sleep(3)
# restart
x.start_csv()
# wait again..
time.sleep(3)
# terminate
x.stop_csv()
The output will be something like:
Fourier Transformation
PyDSlog includes a small class that helps transform the signals into the frequency spectrum.
Dependencies: numpy and scipy must be installed before this module can be used.
import PyDSlog.transform as fft
import PyDSlog.stream as stream
import matplotlib.pyplot as plt
import numpy as np
chan = ["ACCX","ACCY","ACCZ"]
serial_port = "COM6"
frequency = 400
size_signal = 400
period = 1.0/frequency
# initialize
transform = fft.FFTGenerator(period, size_signal, frequency)
x = stream.MLS160A_stream(sz_block=size_signal, channels_to_use=chan,
                          frequency=frequency, port=serial_port, baudrate=115200)
try:
    # connect to the sensor and start the stream before reading
    x.connect()
    x.start()
    # read values as signal
    sensor_values = x.read(transpose=False)
finally:
    # stop the stream and release the serial port
    x.stop()
    x.disconnect()
sensor_values = np.array(sensor_values)
# transform
ffts_val = transform.doFFT(sensor_values, delete_offset=True)
# where ffts_val[0,c,:,0] are the frequencies and ffts_val[0,c,:,1] are the amplitudes of channel c
c = 0  # channel index to plot; 0 is the first entry of chan ("ACCX")
# plot
plt.plot(ffts_val[0,c,:,0], ffts_val[0,c,:,1], linestyle='-', label="FFT")
plt.show()
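To make the frequency and amplitude columns more concrete, here is a standalone sketch of a single-sided amplitude spectrum computed with a naive discrete Fourier transform from the standard library. It is not the implementation behind FFTGenerator. The function name amplitude_spectrum is hypothetical, and treating delete_offset as "subtract the mean before transforming" is an assumption. The input is a synthetic 50 Hz sine with a constant offset, not sensor data.

```python
import cmath
import math


def amplitude_spectrum(signal, period, delete_offset=True):
    """Return (frequencies, amplitudes) of a real signal using a naive DFT."""
    n = len(signal)
    if delete_offset:
        # Assumed meaning of delete_offset: remove the constant (DC) component
        mean = sum(signal) / n
        signal = [v - mean for v in signal]
    half = n // 2  # a real signal only needs the first half of the spectrum
    freqs = [k / (n * period) for k in range(half)]
    amps = []
    for k in range(half):
        acc = sum(v * cmath.exp(-2j * math.pi * k * i / n)
                  for i, v in enumerate(signal))
        amps.append(2.0 * abs(acc) / n)  # scale so a unit sine has amplitude 1
    return freqs, amps


frequency = 400
size_signal = 400
period = 1.0 / frequency

# Synthetic test signal: 50 Hz sine of amplitude 1 on top of an offset of 3
samples = [3.0 + math.sin(2 * math.pi * 50 * i * period) for i in range(size_signal)]

freqs, amps = amplitude_spectrum(samples, period)
peak = max(range(len(amps)), key=amps.__getitem__)
print(freqs[peak], round(amps[peak], 3))
```

With size_signal samples taken at frequency Hz, neighbouring spectrum bins are frequency / size_signal apart, which is 1 Hz for the values used here.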
PyDSlog simple signal classifier
PyDSlog includes a simple signal classifier. The classifier forms groups, each of which stores a master vector, and assigns vectors to groups according to their correlation. During each learning cycle, the correlation of the new vector with the master vector of each group is computed. The group whose master vector has the highest correlation with the new vector is assumed to be the group to which the new vector belongs. Once the group is found, a new master vector is calculated and stored in the group to be used in future operations.
Once the learning process has been completed, it is possible to predict which group a new vector belongs to by calculating the average distance to the master vector from the group with the highest correlation.
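The learning cycle described above can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the code behind SignalClassificator. Several details are assumptions: the master vector is updated as a running mean, a new group is opened when no correlation reaches the threshold, a vector is forced into the best group once the group limit is reached, and class labels are ignored. The names pearson and learn_groups are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if std_a == 0 or std_b == 0:
        return 0.0  # a constant vector has no defined correlation
    return cov / (std_a * std_b)


def learn_groups(vectors, min_pears_correlation=0.7, max_subgroups=5):
    """Group vectors by their correlation with each group's master vector."""
    groups = []  # each group: {"master": [...], "count": int}
    for v in vectors:
        best, best_r = None, -2.0
        for g in groups:
            r = pearson(v, g["master"])
            if r > best_r:
                best, best_r = g, r
        fits = best is not None and best_r >= min_pears_correlation
        if fits or (best is not None and len(groups) >= max_subgroups):
            # Assumed update rule: the master becomes the running mean of its members
            c = best["count"]
            best["master"] = [(m * c + x) / (c + 1) for m, x in zip(best["master"], v)]
            best["count"] = c + 1
        else:
            groups.append({"master": list(v), "count": 1})
    return groups


# Synthetic vectors: two rising ramps and two falling ramps
vectors = [[1, 2, 3, 4], [2, 4, 6, 8.5], [4, 3, 2, 1], [8, 6, 4, 2.2]]
groups = learn_groups(vectors)
print(len(groups), [g["count"] for g in groups])
```

The rising ramps correlate strongly with each other and negatively with the falling ones, so the four vectors end up in two groups.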
Dependencies: numpy and scipy must be installed before this module can be used. For advanced functionality, installing scikit-learn is also recommended.
The class:
PyDSlog.classificator.SignalClassificator(min_pears_correlation=0.7, max_subgroups=5, outliers=False)
where:
- min_pears_correlation: default 0.7. Float. The minimum Pearson correlation required to add a new vector to a group.
- max_subgroups: default 5. Integer. The maximum number of groups generated while learning.
- outliers: default False. Boolean. If the average distance found when predicting a class for a new vector is greater than the tolerance (see predict(signals, tolerance)), -1 is returned as the class, indicating an unknown class or an anomaly.
To train the algorithm, use the fit(x_train, y_train) method, where x_train are the signals and y_train are the labels.
To make a prediction, use the predict(signals, tolerance) method, where signals is the vector to be classified and tolerance is a number that indicates the tolerance to be used. If the average distance (or difference) of the new vector from the master vector within each group is greater than the tolerance, the vector belongs to an unknown class or is an anomaly (-1).
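The role of the tolerance can be illustrated with a small standalone sketch. It is not the library's predict() implementation: the names pearson and predict_one are hypothetical, the masters dictionary stands in for whatever the classifier stores after training, and the mean absolute difference is assumed as the distance measure.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    std_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    std_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    if std_a == 0 or std_b == 0:
        return 0.0
    return cov / (std_a * std_b)


def predict_one(vector, masters, tolerance, outliers=True):
    """Return the label of the best-correlated master, or -1 for an outlier."""
    best_label = max(masters, key=lambda label: pearson(vector, masters[label]))
    master = masters[best_label]
    # Assumed distance measure: mean absolute difference to the master vector
    avg_distance = sum(abs(x - m) for x, m in zip(vector, master)) / len(vector)
    if outliers and avg_distance > tolerance:
        return -1  # too far from the closest master: unknown class or anomaly
    return best_label


# Hypothetical master vectors for two classes
masters = {0: [1, 2, 3, 4], 1: [4, 3, 2, 1]}

close_to_0 = predict_one([1.1, 2.0, 3.2, 3.9], masters, tolerance=0.5)
close_to_1 = predict_one([4.2, 2.9, 2.1, 0.8], masters, tolerance=0.5)
anomaly = predict_one([10, 20, 30, 40], masters, tolerance=0.5)
print(close_to_0, close_to_1, anomaly)
```

The third vector has the same shape as the master of class 0, so its correlation is high, but it is far away in absolute terms. This is the case the tolerance check is meant to catch.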
As an example:
import PyDSlog.classificator as classificator
import PyDSlog.transform as transform
import pandas as pd
import numpy as np
################## STREAM DATA PARAMETERS #################
N = 5000
fs = 500
T = 1 / fs
PREFIX = "1478217877058"
X_FILES = [PREFIX+"_x_ACCX_.csv",PREFIX+"_x_ACCY_.csv",PREFIX+"_x_ACCZ_.csv",
PREFIX+"_x_GYRX_.csv",PREFIX+"_x_GYRY_.csv",PREFIX+"_x_GYRZ_.csv"]
Y_FILE = PREFIX+"_y_.csv"
def read_signals(name):
    r = pd.read_csv(name, header=None, index_col=None)
    return r
signals = []
for file in X_FILES:
    s = np.array(read_signals("../test/test/"+file))
    signals.append(s)
signals = np.transpose(np.array(signals), (1, 0, 2))
labels = np.array(pd.read_csv("../test/test/"+Y_FILE, header=None, index_col=None))
labels = np.squeeze(labels)
t = transform.FFTGenerator(T, N, fs)
v_ffts = t.doFFT(signals, delete_offset=True)
print(v_ffts.shape)
################## TRAIN TEST SPLIT #################
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(v_ffts[:,:,:,1], labels, test_size=0.4)
cls = classificator.SignalClassificator()
cls.fit(x_train, y_train)
y_pred = cls.predict(x_test, 4.5, verbose=True)
from sklearn.metrics import classification_report
print(classification_report(y_test, y_pred))
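The example above loads one file per channel and then reorders the axes. The sketch below shows that reordering on tiny made-up data using only the standard library, assuming the save_as_signal layout described earlier (one file per channel, one signal per line, values separated by commas). The file contents are invented for illustration.

```python
import csv
import io

# Hypothetical contents of two per-channel files, two signals of three values each
accx_csv = "0.1,0.2,0.3\n0.4,0.5,0.6\n"
accy_csv = "1.1,1.2,1.3\n1.4,1.5,1.6\n"


def read_signals(text):
    """Parse one channel file into a list of signals (one list per line)."""
    return [[float(v) for v in row] for row in csv.reader(io.StringIO(text))]


# Indexed as per_channel[channel][signal][sample]
per_channel = [read_signals(accx_csv), read_signals(accy_csv)]

# Reorder to per_signal[signal][channel][sample]; this is the same axis swap
# that np.transpose(..., (1, 0, 2)) performs in the example above
per_signal = [list(channels) for channels in zip(*per_channel)]

print(per_signal[0])
print(per_signal[1])
```

After the swap, each entry holds all channels of one recorded signal, which is the layout the FFT and the classifier receive in the example.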