Decode CAN BLF logs using DBC files into pandas DataFrames and export to CSV

canml

canml is a Python toolkit for production-scale decoding of CAN bus logs (BLF) using DBC definitions. It streams large BLF files into pandas DataFrames, either in chunks or all at once, and offers robust CSV and Parquet export, signal-level filtering, DBC merging, and progress reporting.


Key Features

  • Merge DBC files
    Load one or multiple .dbc files into a single cantools Database, with optional signal‐name prefixing to avoid collisions.

  • Chunked streaming
    Decode arbitrarily large BLF logs in fixed‐size pandas DataFrame chunks, with an optional progress bar.

  • Full‐file mode
    Load an entire BLF into one DataFrame, with optional message‐ID filtering, uniform timestamp spacing, and injection of expected signals (NaN‐filled if missing).

  • Flexible export
    Incremental CSV export (to_csv) or single‐shot Parquet export (to_parquet).

  • Signal & message filtering
    Only decode specified CAN IDs or automatically add missing signals for downstream consistency.
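The full-file behaviors above (NaN injection for expected signals, uniform timestamp spacing) can be sketched in plain pandas without canml installed; the signal names and values below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical decoded frame: one signal over three samples with raw timestamps.
df = pd.DataFrame({
    "timestamp": [0.0012, 0.0118, 0.0231],
    "EngineData_EngineRPM": [800.0, 812.5, 825.0],
})

# Inject expected signals the log never contained (NaN-filled), so every
# downstream consumer sees the same schema.
expected = ["EngineData_EngineRPM", "BrakeStatus_ABSActive"]
for col in expected:
    if col not in df.columns:
        df[col] = np.nan

# Override raw timestamps with uniform 10 ms spacing.
df["timestamp"] = np.arange(len(df)) * 0.01
```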


Installation

pip install canml

Dependencies:

  • Python ≥ 3.8, < 4.0
  • cantools ≥ 39.4.4
  • python-can ≥ 4.4.0
  • pandas ≥ 2.2.2
  • numpy ≥ 1.26.4
  • tqdm ≥ 4.0.0
  • pyarrow ≥ 11.0.0

Usage Quickstart

    from canml.canmlio import (
        load_dbc_files,
        iter_blf_chunks,
        load_blf,
        to_csv,
        to_parquet
    )

    # 1. Merge multiple DBCs (with optional signal‐prefixing)
    db = load_dbc_files(["powertrain.dbc", "chassis.dbc"], prefix_signals=True)

    # 2. Stream‐decode a large BLF in 50k‐row chunks, filtering only IDs 0x100 & 0x200
    for idx, df_chunk in enumerate(iter_blf_chunks(
            blf_path="vehicle.blf",
            db=db,
            chunk_size=50_000,
            filter_ids={0x100, 0x200}
        )):
        to_parquet(df_chunk, f"shard-{idx:03}.parquet")

    # 3. Load a smaller BLF fully, enforce uniform 10 ms timestamps,
    #    and ensure specific signals (even if missing) appear as NaN
    df_full = load_blf(
        blf_path="session0.blf",
        db=db,
        message_ids={0x100, 0x200},
        expected_signals=["EngineData_EngineRPM", "BrakeStatus_ABSActive"],
        force_uniform_timing=True,
        interval_seconds=0.01
    )

    # 4. Export to CSV
    to_csv(df_full, "session0_decoded.csv")

API Reference

load_dbc_files(dbc_paths, prefix_signals=False) → Database

Load one or more DBC files into a merged cantools Database.

dbc_paths – single path or list of .dbc file paths

prefix_signals – if True, renames signals to "<MessageName>_<SignalName>".
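The point of prefixing is collision avoidance when two messages define a signal with the same name. A minimal sketch of the naming scheme (message and signal names here are hypothetical, not taken from any real DBC):

```python
# Two messages each define a "Temperature" signal; without a prefix the
# decoded DataFrame columns would collide.
messages = {
    "EngineData": ["EngineRPM", "Temperature"],
    "GearboxData": ["Temperature"],
}

# "<MessageName>_<SignalName>" keeps every column unique.
columns = [f"{msg}_{sig}" for msg, sigs in messages.items() for sig in sigs]
```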

iter_blf_chunks(blf_path, db, chunk_size=10000, filter_ids=None) → Iterator[DataFrame]

Stream‐decode a BLF into DataFrame chunks.

blf_path – path to .blf file

db – Database from load_dbc_files

chunk_size – max rows per DataFrame

filter_ids – set of CAN IDs to include
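The chunking contract (at most chunk_size rows per yielded DataFrame, plus a final partial chunk) is independent of the BLF format. A pure-pandas sketch of the pattern, using plain dicts in place of decoded CAN frames:

```python
import pandas as pd

def chunked(rows, chunk_size):
    """Yield DataFrames of at most chunk_size rows each, mirroring the
    iter_blf_chunks contract (rows here are plain dicts, not CAN frames)."""
    buf = []
    for row in rows:
        buf.append(row)
        if len(buf) >= chunk_size:
            yield pd.DataFrame(buf)
            buf = []
    if buf:
        yield pd.DataFrame(buf)  # final partial chunk

rows = ({"timestamp": i * 0.01, "Signal": i} for i in range(7))
chunks = list(chunked(rows, chunk_size=3))
# 7 rows in chunks of 3 -> chunk sizes 3, 3, 1
```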

load_blf(blf_path, db, message_ids=None, expected_signals=None, force_uniform_timing=False, interval_seconds=0.01) → DataFrame

Decode an entire BLF into one DataFrame.

message_ids – restrict to given CAN IDs

expected_signals – list of columns to inject as NaN if missing

force_uniform_timing – override raw timestamps with uniform spacing
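The message_ids filter admits only frames whose arbitration ID is in the given set; everything else is skipped before decoding. A sketch with hypothetical (id, payload) tuples standing in for raw frames:

```python
# Only frames whose arbitration ID is in message_ids survive the filter.
message_ids = {0x100, 0x200}
frames = [(0x100, b"\x01"), (0x300, b"\x02"), (0x200, b"\x03")]
kept = [f for f in frames if f[0] in message_ids]
```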

to_csv(df_or_iter, output_path, mode='w', header=True) → None

Write DataFrame or iterator of DataFrames to CSV.
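Incremental export over an iterator amounts to writing the header once and appending thereafter; a self-contained sketch of that pattern with pandas (the path and chunk data are illustrative):

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.gettempdir(), "decoded.csv")
chunks = [pd.DataFrame({"a": [1, 2]}), pd.DataFrame({"a": [3]})]

# Write the first chunk with a header, then append the rest without one.
for i, chunk in enumerate(chunks):
    chunk.to_csv(path, mode="w" if i == 0 else "a",
                 header=(i == 0), index=False)
```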

to_parquet(df, output_path, compression='snappy') → None

Write DataFrame to Parquet (pyarrow engine).

Contributing

Contributions are welcome! To contribute:

  1. Fork the repository on GitHub.
  2. Create a new branch for your feature or bug fix.
  3. Submit a pull request with a clear description of your changes.
  4. To update the documentation, install the Sphinx tooling first:

     pip install sphinx sphinx-rtd-theme

Please open an issue to discuss major changes before starting work.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Credits

  • Inspired by cantools and python-can for CAN bus parsing.
  • Built on pandas and NumPy for DataFrame construction, with tqdm for progress reporting and pyarrow for Parquet export.
  • Special thanks to the Python community for their open-source contributions.

Contact

For questions or support, please open an issue on the GitHub repository.
