Decode CAN BLF logs using DBC files into pandas DataFrames and export to CSV
Project description
canml: A Python Library for Preparing CAN Bus Data for Machine Learning
Description
canml is a Python library for preparing Controller Area Network (CAN) bus data for machine learning. It decodes BLF (Binary Logging Format) log files with DBC database files, preprocesses the data, extracts features, and exports the result in machine-learning-friendly formats such as CSV, Excel, and pandas DataFrames. It also offers visualization tools for data exploration.
Why canml?
While libraries like cantools and python-can excel in parsing and decoding CAN bus data, they do not focus on preparing this data for machine learning workflows. canml fills this gap by providing specialized functions for:
- Preprocessing time-series CAN data (e.g., handling missing values, resampling).
- Extracting features relevant for machine learning (e.g., statistical summaries, frequency-domain features).
- Exporting data in machine learning-ready formats.
- Integrating with popular machine learning libraries like scikit-learn.
This makes canml particularly useful for engineers and data scientists working on applications such as anomaly detection, predictive maintenance, and driver behavior analysis in automotive and industrial settings.
Features
- Parse BLF files with DBC files to decode CAN messages into meaningful signals.
- Preprocess data:
  - Handle missing values with interpolation or forward fill.
  - Resample time-series data to a uniform grid.
  - Normalize or standardize signals.
- Extract features:
  - Statistical features (mean, standard deviation, min, max) over sliding windows.
  - Frequency-domain features (FFT, power spectral density).
  - Custom feature engineering based on domain knowledge.
- Export data to:
  - CSV and Excel for reporting and sharing.
  - pandas DataFrames and NumPy arrays for machine learning.
- Visualization:
  - Time-series plots of signals.
  - Histograms and correlation matrices for data exploration.
- Machine learning integration:
  - Scikit-learn-compatible API for use in ML pipelines.
  - High-level functions for common tasks like anomaly detection.
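To make the preprocessing and statistical-feature steps listed above concrete, here is a minimal sketch written with plain pandas rather than canml itself. The helper names `preprocess` and `window_stats`, the signal names, and the sample values are all illustrative assumptions; canml's own functions may behave differently.

```python
import numpy as np
import pandas as pd


def preprocess(signals: pd.DataFrame, freq: str = "100ms") -> pd.DataFrame:
    """Resample to a uniform grid, fill gaps, and standardize each signal."""
    # Average samples that fall into the same bin; empty bins become NaN.
    uniform = signals.resample(freq).mean()
    # Time-weighted linear interpolation, then forward/backward fill for the edges.
    filled = uniform.interpolate(method="time").ffill().bfill()
    # Z-score standardization; a constant column would divide by zero, so guard it.
    std = filled.std(ddof=0).replace(0.0, 1.0)
    return (filled - filled.mean()) / std


def window_stats(signals: pd.DataFrame, window: str = "1s") -> pd.DataFrame:
    """Mean, std, min, and max of every signal over a sliding time window."""
    stats = signals.rolling(window).agg(["mean", "std", "min", "max"])
    # Flatten the (signal, statistic) column pairs into names like "EngineRPM_mean".
    stats.columns = [f"{sig}_{stat}" for sig, stat in stats.columns]
    return stats


# Irregularly sampled, made-up data with one missing value.
index = pd.to_datetime([0, 50, 310, 400, 520], unit="ms")
raw = pd.DataFrame(
    {
        "EngineRPM": [800.0, 820.0, 900.0, np.nan, 950.0],
        "VehicleSpeed": [10.0, 10.5, 12.0, 12.5, 13.0],
    },
    index=index,
)

clean = preprocess(raw)
features = window_stats(clean)
print(features.columns.tolist())
```

The sketch expects a `DatetimeIndex`, because both `resample` and the time-based rolling window rely on it.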
Installation
To install canml, use pip:
pip install canml
Dependencies:
- Python 3.7+
- pandas
- NumPy
- scikit-learn
- matplotlib
- cantools (optional, for enhanced BLF parsing)
Usage
The example below loads a BLF file, preprocesses the data, extracts features, exports them to CSV, and plots two signals.
import canml
import pandas as pd
# Load BLF and DBC files
data = canml.io.load_blf('data.blf', 'config.dbc')
# Preprocess data
data_clean = canml.preprocess.handle_missing(data, method='interpolate')
data_resampled = canml.preprocess.resample(data_clean, freq='100ms')
data_normalized = canml.preprocess.normalize(data_resampled)
# Extract features
features = canml.features.extract_stats(data_normalized, window='1s', stats=['mean', 'std'])
# Export to CSV
canml.export.to_csv(features, 'output.csv')
# Visualize data
canml.viz.plot_timeseries(data_normalized, signals=['EngineRPM', 'VehicleSpeed'])
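For readers curious what the loading step involves, the following sketch decodes a BLF log into a DataFrame using python-can and cantools directly. It is an assumption about the general approach, not canml's actual implementation: `decode_messages` and `load_blf_to_frame` are hypothetical helper names, and only `can.BLFReader`, `cantools.database.load_file`, and the database's `decode_message` method come from those libraries.

```python
from typing import Any, Iterable

import pandas as pd


def decode_messages(messages: Iterable[Any], db: Any) -> pd.DataFrame:
    """Decode raw CAN frames into one row per frame, one column per signal.

    `messages` yields objects with `timestamp`, `arbitration_id`, and `data`
    (as python-can's `can.Message` does); `db` provides `decode_message`
    (as a cantools database does).
    """
    rows = []
    for msg in messages:
        try:
            signals = db.decode_message(msg.arbitration_id, msg.data)
        except KeyError:
            # The frame ID is not described in the DBC file; skip it.
            continue
        rows.append({"timestamp": msg.timestamp, **signals})
    frame = pd.DataFrame(rows)
    return frame.set_index("timestamp") if not frame.empty else frame


def load_blf_to_frame(blf_path: str, dbc_path: str) -> pd.DataFrame:
    """Hypothetical stand-in for the load step, built on python-can and cantools."""
    import can  # provided by the python-can package
    import cantools

    db = cantools.database.load_file(dbc_path)
    with can.BLFReader(blf_path) as reader:
        return decode_messages(reader, db)
```

Because different messages carry different signals, the resulting frame is sparse: each row has values only for the signals of its own message, which is one reason the missing-value handling and resampling steps matter.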
For advanced usage, such as integrating with scikit-learn for anomaly detection, refer to the documentation.
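As a rough idea of what such an integration could look like, the sketch below feeds a feature table into a scikit-learn pipeline with an `IsolationForest`. It uses only scikit-learn and synthetic data; the column names and the injected outlier are made up for illustration, and canml's own high-level anomaly-detection functions are not shown here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a table of windowed features.
rng = np.random.default_rng(0)
features = pd.DataFrame(
    {
        "EngineRPM_mean": rng.normal(2000.0, 50.0, size=200),
        "VehicleSpeed_mean": rng.normal(60.0, 2.0, size=200),
    }
)
features.iloc[-1] = [6000.0, 0.0]  # inject one obvious outlier

# Scale the features, then fit an unsupervised outlier detector.
model = make_pipeline(
    StandardScaler(),
    IsolationForest(contamination=0.01, random_state=0),
)
labels = model.fit_predict(features)  # -1 marks anomalies, 1 marks normal rows

print(features[labels == -1])
```

The `contamination` value is a guess at the expected share of anomalous windows and would need tuning on real data.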
Contributing
Contributions are welcome! To contribute:
- Fork the repository on GitHub.
- Create a new branch for your feature or bug fix.
- Submit a pull request with a clear description of your changes.
Please open an issue to discuss major changes before starting work.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Credits
- Inspired by cantools and python-can for CAN bus parsing.
- Built using pandas, NumPy, scikit-learn, and matplotlib for data manipulation, machine learning, and visualization.
- Special thanks to the Python community for their open-source contributions.
Contact
For questions or support, please open an issue on the GitHub repository.
File details
Details for the file canml-0.1.3.tar.gz.
File metadata
- Download URL: canml-0.1.3.tar.gz
- Upload date:
- Size: 8.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 908106200a8bdb734bc26b103f69fd4d9ddb058f7eb833880ebe1f2cc84efde2 |
| MD5 | 5a95c716720c1f46bda567c3a0695f0f |
| BLAKE2b-256 | 548b24389af62f1e0b6045ec40f1a3ad1747bb0a09478580793b3914aa227606 |
File details
Details for the file canml-0.1.3-py3-none-any.whl.
File metadata
- Download URL: canml-0.1.3-py3-none-any.whl
- Upload date:
- Size: 6.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.12.10
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a93f9ce9c11540cf6a1ad7f908cb1ba08a6c76284221000f9ad61574734f0555 |
| MD5 | 4b88f7a64dd953d94bae11897b4840da |
| BLAKE2b-256 | 746ce5f87387aeb8a0ac64ea0ecdbf354f51b6202e408c2e69bab2369b10812b |
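A downloaded file can be checked against the published SHA256 digest with Python's standard library. The helper name `sha256_of_file` is an illustrative choice, and the file is assumed to sit in the current directory.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `sha256_of_file("canml-0.1.3.tar.gz")` should equal the SHA256 value listed for the source distribution if the download is intact.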