CLI software for detecting the single target sound event of bicycle bell signals.
Bicycle Bell Sound Event Detection System
Author: Clemens Kubach
This repository is one of three for my bachelor thesis on "Development of an Embedded System for Detecting Acoustic Alert Signals of Cyclists Using Neural Networks".
It contains the software as an easy-to-use and customizable CLI for the project. Only this part has to be installed on the target device and can be used and developed independently of the other components.
A trained saved model can be selected, which is then converted to an inference format (TFLite or TF-TRT), allowing real-time predictions of a single sound event from live-streamed audio via connected sound devices.
The other related repositories are:
Getting Started
The software is based on the PortAudio library for audio I/O. Therefore, this must be installed on the system. For more detailed installation instructions on an embedded device like the Jetson Nano, see the corresponding chapter.
apt-get update
apt-get install portaudio19-dev
pip install --upgrade pip
pip install bicycle-bell-seds-cli
seds-cli --help
seds-cli run --tfmodel='!crnn' production
There are generally four main functionalities, which are displayed with seds-cli --help:

- conversion can convert recordings of a previously executed run command (with appropriate parameterization for sound recording) into a wave file.
- devices can be used to test the available sound devices by performing a sound check.
- resources can be used to find the location of resource files such as log files or recordings.
- run is the main functionality of the software. This command is used to start a sound event detection.
General Information
Generally, two versions of the CLI are installed: jn-seds-cli and seds-cli. The first one is based on the second one and only contains simplifications and specifications for the execution of the bicycle bell detection on the Jetson Nano. With the right choice of parameters, however, both CLIs can be used on all devices without any problems. Details about the differences can be found via jn-seds-cli run --help. In the following, the jn-seds-cli version will be used for easier copy-and-paste usage on the Jetson Nano as the target device.
Please use --help for detailed explanations of the individual software functionalities and parameters. With this you can get help for each level, i.e.: jn-seds-cli --help, jn-seds-cli run --help, jn-seds-cli run evaluation --help.
Usage Examples
Show the location of the resources folder:
jn-seds-cli resources where
Make a sound check of the audio devices:
jn-seds-cli devices soundcheck
Start a sound event detection, saving the logs to a file in the resources folder and recording the first minute of the received audio input stream:
jn-seds-cli run --tfmodel='!crnn' production --save_log=True --save_records=True --storage_length=60
Convert the recorded file of the previous run into a wave file:
jn-seds-cli conversion record_to_wav --path_storage_pickle="/abs/path/to/seds_cli/res/records/record-xx.pickle" --target_wav_path="./target_filepath/filename.wav"
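The exact pickle format written by seds-cli is internal to the tool, but the conversion step is conceptually similar to the following self-contained sketch. The flat list of float samples and the 16 kHz mono format are assumptions for illustration only; the real record format may differ:

```python
import pickle
import struct
import wave

# Hypothetical stand-in for a seds-cli recording: assume a flat list of
# float samples in [-1.0, 1.0], recorded as 16 kHz mono (an assumption).
samples = [0.0, 0.5, -0.5, 0.25] * 4000  # one second of dummy audio
with open("record-xx.pickle", "wb") as f:
    pickle.dump(samples, f)

# Load the pickled samples back ...
with open("record-xx.pickle", "rb") as f:
    samples = pickle.load(f)

# ... and write them out as a standard 16-bit PCM wave file.
with wave.open("filename.wav", "wb") as wav:
    wav.setnchannels(1)        # mono
    wav.setsampwidth(2)        # 16-bit PCM
    wav.setframerate(16000)    # 16 kHz
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    wav.writeframes(frames)
```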
Run Command
There are two different modes for the run command: production and evaluation. The production mode is the main mode and receives the current sound of the environment live through the selected microphone device. The evaluation mode can play a recorded wave file with a corresponding annotation CSV file and displays the ground-truth value as well as the prediction for the live microphone recordings.
Most parameters for the run command are available for both modes. Mode-specific parameters can be found via --help for the selected mode. The following flags are used for the production mode, but are available for the evaluation mode too.
There are three models pre-defined and pre-trained for direct usage for detecting bicycle bell sounds. They can be chosen by using --tfmodel="!model-name" without any further specification of the saved_model type or the absolute path to the saved model resource in tfmodel. Available are !crnn, !yamnet_base and !yamnet_extended.
Select the predefined CRNN model via !crnn, and run the production mode without displaying the probability value of the predictions. The logs will be saved to a file:
jn-seds-cli run --tfmodel='!crnn' production --save_log=True --prob_logging=False
By specifying an integer for input_device (instead of None), a specific (non-default) sound device can be selected.
Use the extended YAMNet model in a production run and define that the selected (default) input device only has one input channel:
jn-seds-cli run --tfmodel='!yamnet_extended' production --channels=1
Use the base YAMNet model in a production run with a lower threshold than the default, and activate the logging of the probabilities to see that every prediction with a value of at least 0.3 becomes True:
jn-seds-cli run --tfmodel='!yamnet_base' production --threshold=0.3 --prob_logging=True
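The thresholding itself is simple: the model outputs a probability per prediction, and the detection is positive once the probability reaches the threshold. A minimal sketch of this logic (the function name and the default value of 0.5 are illustrative assumptions, not part of seds-cli):

```python
def is_bicycle_bell(probability: float, threshold: float = 0.5) -> bool:
    """Return True when the predicted probability reaches the threshold.

    A lower threshold (e.g. 0.3) makes the detector more sensitive,
    but also more prone to false positives.
    """
    return probability >= threshold

# With the assumed default of 0.5, a probability of 0.4 is rejected;
# with --threshold=0.3 the same prediction becomes positive.
print(is_bicycle_bell(0.4))                 # False
print(is_bicycle_bell(0.4, threshold=0.3))  # True
```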
Mode-specific flag-usage examples:
For the evaluation mode, use the first option for a random test example or specify your own test:
jn-seds-cli run --tfmodel='!crnn' evaluation --save_log=True --silent=False
# Or
jn-seds-cli run --tfmodel='!crnn' evaluation --save_log=True --wav_file="/path/to/wave.wav" --annotation_file="/path/to/annotations.csv" --silent=False
For playback:
jn-seds-cli run --tfmodel='!crnn' production --save_log=True --use_output=True
Advanced Usage
Please note that in order to use the GPU, an appropriately compatible TensorFlow build must be installed and used with --gpu=True. In addition, depending on the specific machine, the inference model type must be set to TensorFlow-TensorRT via --infer_model=tftrt. In some cases, TFLite can also be used with GPU support.
Unfortunately, TF-TRT could not yet be tested thoroughly because compatible devices or software dependencies were not available. For further information, read run --help under the related parameters.
New trained models can be used via --tfmodel="/path/to/tf-savedmodel-dir" --saved_model=crnn.
The best way is to modify the source code in the file saved_models.py and create a new child class of BaseSavedModel or Mono16kWaveInputSavedModel. Then create an entry for this class in selectors.py; this way, custom preprocessing and postprocessing for the model can be defined.
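Such an extension might look roughly like the following sketch. Everything here except the class name BaseSavedModel and the file names saved_models.py and selectors.py is hypothetical: the actual base-class interface in seds-cli may differ, so a minimal stand-in base class is defined to keep the sketch self-contained:

```python
# Hypothetical sketch of extending saved_models.py with a custom model.
# The real BaseSavedModel interface in seds-cli may differ; a minimal
# stand-in is defined here so the example is runnable on its own.

class BaseSavedModel:  # stand-in for the seds-cli base class
    def preprocess(self, waveform):
        return waveform

    def postprocess(self, raw_output):
        return raw_output


class MyCustomSavedModel(BaseSavedModel):
    """Custom model wrapper with its own pre- and postprocessing."""

    def preprocess(self, waveform):
        # Example: peak-normalize the mono waveform before inference.
        peak = max((abs(s) for s in waveform), default=1.0) or 1.0
        return [s / peak for s in waveform]

    def postprocess(self, raw_output):
        # Example: reduce the raw model output to a single probability.
        return max(raw_output)
```

An entry for the new class would then be registered in selectors.py so the CLI can offer it through its model-selection parameters.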
An easier way, without modifying code, is to use a saved model that matches the currently supported interface of mono 16 kHz waveform input. In that case, it can easily be used via --tfmodel="/path/to/tf-savedmodel-dir" --saved_model=MONO_16K_IN.
Please note that this feature has not yet been thoroughly tested.
Jetson Nano Setup
The following explanation is based on the latest stable version of the JetPack OS (4.6.1) for the Jetson Nano at the time of writing. The future version JetPack 5.0 is expected to resolve the installation issues with tensorflow_io and thus possibly allow support for the GPU and TensorFlow-TRT on the Jetson Nano. However, without development intentions, the version 4.6.1 of JetPack documented here should be used for now.
Make sure that JetPack 4.6.1 has been installed!
apt-get update
apt-get install portaudio19-dev
sudo apt-get update
sudo apt-get install libhdf5-serial-dev hdf5-tools libhdf5-dev zlib1g-dev zip libjpeg8-dev liblapack-dev libblas-dev gfortran
sudo apt-get install python3-pip
mkdir ./venv
python3 -m venv ./venv
source venv/bin/activate
pip install -U pip testresources setuptools==49.6.0
pip install -U --no-deps numpy==1.19.4 future==0.18.2 mock==3.0.5 keras_preprocessing==1.1.2 keras_applications==1.0.8 gast==0.4.0 protobuf pybind11 cython pkgconfig
env H5PY_SETUP_REQUIRES=0 pip install -U --no-build-isolation h5py==3.1.0
pip install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v461 'tensorflow>=2'
python -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
PIP_EXTRA_INDEX_URL=https://snapshots.linaro.org/ldcg/python-cache pip install tensorflow_io
python -c "from tensorflow.python.client import device_lib; device_lib.list_local_devices()"
pip install bicycle-bell-seds-cli
jn-seds-cli --help
jn-seds-cli run --tfmodel='!crnn' production
It is expected that no GPU will be detected after the tensorflow_io installation. This is because this build of tensorflow_io brings a specific build of tensorflow (2.6) that does not support GPUs.
With JetPack 5.0 and thus a higher Python version (>3.6), more recent versions of tensorflow_io can be installed directly, for which there are also pre-built wheels for aarch64.
Most of the installation steps for TensorFlow on the Jetson Nano are from "Prerequisites and Dependencies" in the corresponding Nvidia docs.
Using Docker can save some setup steps, but can also add others. If Docker is to be used on the Jetson Nano:
sudo docker run -it --rm --runtime nvidia --network host -v /home/jetson:/home/jetson --device /dev/snd nvcr.io/nvidia/l4t-tensorflow:r32.7.1-tf2.7-py3
Development
Feel free to report bugs as issues and also to contribute to the project. Please contact me for this. Especially the integration of new models and the full, tested integration of TF-TRT are still outstanding points of improvement. In addition, in a further step and repository, the SEDS-CLI will be offered completely separately from the bicycle bell sound event.
Use the following steps to install directly from the GitHub repository, and do not forget to call git lfs pull before running. This will download the model data.
apt-get update
apt-get install portaudio19-dev git git-lfs
git clone https://github.com/ClemensKubach/bicycle-bell-sed-software.git
cd bicycle-bell-sed-software
git lfs pull
pip install -e .