Converters and intake QA for battery cycler data.
Battery Data Standard
battery-data-standard is a local Python library and command-line tool for
turning battery cycler exports into validated, analysis-ready CSV or Parquet
outputs with auditable conversion reports. It is intended for laboratories,
battery test teams, and data pipelines that need a repeatable private path from
vendor files to downstream tools such as DuckDB, Polars, PyBaMM, PyProBE,
cellpy, BEEP, and Battery Archive-style workflows.
The package is vendor-neutral and independent. It is not certified by any cycler vendor or standards body. Adapter support describes behavior implemented and validated by this project; users should verify representative exports from their own cycler software before using the package in automated production workflows.
Installation
Python 3.10 or newer is required.
pip install --upgrade battery-data-standard
Optional extras are available for additional input formats:
pip install "battery-data-standard[yaml]"
pip install "battery-data-standard[matlab]"
pip install "battery-data-standard[mpr]"
The package installs the bds command and exposes both the full package name
and a short import alias:
import battery_data_standard as bds
# or
import bds
Scope
The package provides:
- conversion of supported battery cycler time-series exports to standardized CSV or Parquet files;
- conversion of supported EIS tables to a standardized EIS table;
- cycler detection, data-kind detection, validation, conversion reports, and batch manifests;
- single-file diagnostics that explain detection, mapping, current-sign evidence, repairs, validation, and recommended next actions;
- intake audit reports with file-level conversion quality scores;
- archive-aware batch conversion for directories, zip archives, and tar archives;
- optional profile files for lab-specific column naming.
The package does not upload source data to an external service. It reads local files and writes local outputs.
Command-Line Usage
Inspect the installed version:
bds --version
Detect a cycler export:
bds detect raw_export.csv
Explain how one file will be detected, mapped, validated, and exported:
bds explain raw_export.csv --text
bds explain raw_export.csv --html report.html --xlsx report.xlsx --json report.json
Convert a time-series file:
bds convert raw_export.csv normalized.bds.csv --cycler auto --report auto
--report auto writes normalized.bds.report.json and
normalized.bds.report.pdf next to the converted CSV. Add
--report-format html or --report-format xlsx when those review formats are
needed.
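A pipeline that collects reports after a conversion may need to predict these file names. The following is a minimal sketch, assuming the naming rule is "drop the final extension of the output file, then append .report.<format>", which matches the normalized.bds.csv example above. The helper name expected_report_paths is hypothetical and is not part of the bds API.

```python
from pathlib import Path


def expected_report_paths(output_path, formats=("json", "pdf")):
    """Return the report paths expected next to a converted file.

    Hypothetical helper: assumes reports are named by replacing the final
    extension of the output file with ".report.<format>".
    """
    out = Path(output_path)
    # Path.stem drops only the last suffix: "normalized.bds.csv" -> "normalized.bds"
    return [out.with_name(f"{out.stem}.report.{fmt}") for fmt in formats]


paths = expected_report_paths("normalized.bds.csv")
print([p.name for p in paths])
# ['normalized.bds.report.json', 'normalized.bds.report.pdf']
```

Checking the predicted names against the files that a real conversion writes is advisable before relying on this rule in automation.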
Export directly to a downstream staging format:
bds convert raw_export.csv pybamm_drive_cycle.csv --target pybamm
bds convert raw_export.csv pyprobe_staging.parquet --target pyprobe --format parquet
Validate a converted file:
bds validate normalized.bds.csv
Convert a directory or archive and write a JSONL manifest:
bds batch raw_exports normalized_exports --recursive --manifest manifest.jsonl
bds batch raw_exports.zip normalized_exports --manifest manifest.jsonl
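A JSONL manifest holds one JSON object per line, so it can be summarized with the standard library alone. The sketch below uses stand-in records; the "source" and "status" keys are assumptions for illustration, not the documented manifest schema, so inspect a real manifest before depending on any key.

```python
import io
import json
from collections import Counter


def read_jsonl(stream):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in stream if line.strip()]


# Stand-in manifest content; the "source" and "status" keys are assumptions.
sample = io.StringIO(
    '{"source": "a.csv", "status": "ok"}\n'
    '{"source": "b.csv", "status": "failed"}\n'
    '{"source": "c.csv", "status": "ok"}\n'
)
records = read_jsonl(sample)

# Tally records by the assumed status key, tolerating records that lack it.
counts = Counter(record.get("status", "unknown") for record in records)
print(dict(counts))  # {'ok': 2, 'failed': 1}
```

For a manifest on disk, pass open("manifest.jsonl", encoding="utf-8") in place of the in-memory sample.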
Audit a raw folder before committing to conversion:
bds audit raw_exports --recursive --json audit.json --html audit.html
The audit report scores each raw data file and highlights conversion failures, missing required fields, unit conversions, time-axis repairs, current-sign evidence, duplicate timestamps, non-monotonic time, suspicious flat voltage/current, and cycle/step anomalies. Optional-column coverage is reported separately so small but valid fixtures are not treated as low-quality data just because they omit fields such as temperature or energy.
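To make three of these findings concrete, here is a minimal sketch of what duplicate timestamps, non-monotonic time, and a flat signal look like in plain Python. The function names and the tolerance value are hypothetical; the sketch illustrates the kinds of condition described above and is not the scoring logic used by bds audit.

```python
def count_duplicate_timestamps(times):
    """Count samples whose timestamp repeats an earlier one."""
    return len(times) - len(set(times))


def count_backward_steps(times):
    """Count positions where time decreases relative to the previous sample."""
    return sum(1 for prev, cur in zip(times, times[1:]) if cur < prev)


def is_flat(values, tolerance=1e-9):
    """Return True when a signal never moves by more than `tolerance`."""
    return bool(values) and max(values) - min(values) <= tolerance


times = [0.0, 1.0, 1.0, 3.0, 2.0, 4.0]
voltage = [3.7, 3.7, 3.7, 3.7, 3.7, 3.7]

print(count_duplicate_timestamps(times))  # 1
print(count_backward_steps(times))        # 1
print(is_flat(voltage))                   # True
```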
Inspect runtime adapter metadata and the pinned schema:
bds formats
bds inspect-schema
Python API
Read a supported time-series export into a Polars dataframe:
import bds
df = bds.read("raw_export.csv", cycler="auto")
Show the user-facing export column names defined in the export template:
import bds
export_df = bds.to_export_frame(df)
print(export_df.columns)
Preserve Raw Current Sign And Repair Time Axis
For real experimental datasets, it is often useful to preserve the current sign exactly as recorded by the source file, allow documented time-axis repairs, and repair and record any gaps in regular sampling:
import bds

path = "raw_export.csv"  # any supported local export
df = bds.read(
    path,
    cycler="auto",
    current_sign="preserve",
    repair_policy="repair",
    time_sampling_policy="repair",
)
Use current_sign="preserve" when downstream analysis should keep the raw
charge/discharge sign convention from the instrument. Use
repair_policy="repair" when the pipeline accepts documented normalizations
such as shifting elapsed test time to start at zero or sorting non-monotonic time
values.
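The two normalizations named above, sorting non-monotonic time values and shifting elapsed time to start at zero, can be sketched in a few lines. This is an illustration of the idea only: the function repair_time_axis, the "test_time" key, and the repair labels are hypothetical and do not describe how bds implements or names its repairs.

```python
def repair_time_axis(rows, time_key="test_time"):
    """Sort rows by time and shift them so the first sample is at zero.

    Returns the repaired rows and the list of repairs that were applied.
    """
    repairs = []
    times = [row[time_key] for row in rows]

    # Repair 1: restore ascending time order when any sample goes backwards.
    if any(cur < prev for prev, cur in zip(times, times[1:])):
        rows = sorted(rows, key=lambda row: row[time_key])  # stable sort
        repairs.append("sorted_non_monotonic_time")

    # Repair 2: shift elapsed time so the test starts at zero.
    start = rows[0][time_key] if rows else 0.0
    if start != 0:
        rows = [{**row, time_key: row[time_key] - start} for row in rows]
        repairs.append("shifted_time_to_zero")

    return rows, repairs


raw = [
    {"test_time": 12.0, "voltage": 3.71},
    {"test_time": 10.0, "voltage": 3.70},
    {"test_time": 11.0, "voltage": 3.705},
]
fixed, applied = repair_time_axis(raw)
print([row["test_time"] for row in fixed])  # [0.0, 1.0, 2.0]
print(applied)  # ['sorted_non_monotonic_time', 'shifted_time_to_zero']
```

Returning the list of applied repairs alongside the data mirrors the idea of keeping every normalization auditable rather than silent.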
When a fixed sampling interval is detected, BDS checks for missing time points.
By default, missing points on that regular grid are interpolated with
time_sampling_interpolation="linear" and recorded in the conversion report.
Use time_sampling_policy="warn" to report gaps without inserting rows. When the
test protocol defines the sampling cadence, set time_sampling_interval_s to that
interval in seconds (for example 1, 2, or 10) instead of relying on detection.
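As an illustration of gap filling on a regular grid, the sketch below inserts linearly interpolated samples wherever consecutive timestamps are more than one interval apart. It assumes sorted, non-empty input and a known interval; fill_regular_gaps is a hypothetical helper, not the bds implementation.

```python
def fill_regular_gaps(times, values, interval_s):
    """Insert linearly interpolated samples where the regular grid has gaps.

    Assumes `times` is sorted and non-empty.
    Returns (new_times, new_values, inserted_times).
    """
    new_times, new_values, inserted = [times[0]], [values[0]], []
    for t0, v0, t1, v1 in zip(times, values, times[1:], values[1:]):
        # Number of grid points missing between two consecutive samples.
        missing = round((t1 - t0) / interval_s) - 1
        for k in range(1, missing + 1):
            t = t0 + k * interval_s
            frac = (t - t0) / (t1 - t0)
            new_times.append(t)
            new_values.append(v0 + frac * (v1 - v0))  # linear interpolation
            inserted.append(t)
        new_times.append(t1)
        new_values.append(v1)
    return new_times, new_values, inserted


times = [0, 1, 2, 5, 6]
voltage = [3.0, 3.1, 3.2, 3.5, 3.6]
t, v, inserted = fill_regular_gaps(times, voltage, interval_s=1)
print(t)         # [0, 1, 2, 3, 4, 5, 6]
print(inserted)  # [3, 4]
```

Keeping the list of inserted timestamps separate makes it possible to report which rows were measured and which were interpolated.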
Use an explicit cycler when the source format is known:
df = bds.read("arbin_export.csv", cycler="arbin")
Convert a file and keep the conversion report:
report = bds.convert(
    "raw_export.csv",
    "normalized.bds.csv",
    cycler="auto",
    report_path="auto",
)
This writes the converted CSV plus JSON and PDF reports. HTML and Excel reports
can be added with report_formats=("html", "xlsx").
Write a downstream staging table by selecting an export target:
bds.convert("raw_export.csv", "pybamm_drive_cycle.csv", target="pybamm")
bds.convert("raw_export.csv", "cellpy_staging.csv", target="cellpy")
Read data and report information in memory:
df, report = bds.read_with_report("raw_export.csv", cycler="auto", strict=False)
Create step-level or cycle-level summaries from a normalized dataframe:
steps = bds.summarize_steps(df)
cycles = bds.summarize_cycles(df)
Output Model
The converter standardizes supported cycler exports into a BDS time-series table. Every successful default export contains three required fields:
| Field | Unit | Description |
|---|---|---|
| Test Time (s) | s | Elapsed time from the start of the test. |
| Voltage (V) | V | Measured cell or channel voltage. |
| Current (A) | A | Measured current. |
Additional fields, such as cycle number, step number, capacity, energy, temperature, power, and internal resistance, are included when they are available in the source file.
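A downstream consumer can confirm that a CSV header carries the three required labels before loading the rest of the file. The following is a minimal standard-library sketch; missing_required_fields is a hypothetical helper and the "Notes" column is a stand-in for any additional field. It is a quick pre-check, not a substitute for bds validate.

```python
import csv
import io

# Required labels as listed in the output model table.
REQUIRED_FIELDS = ("Test Time (s)", "Voltage (V)", "Current (A)")


def missing_required_fields(stream):
    """Return the required column labels absent from a CSV header."""
    header = next(csv.reader(stream), [])
    return [name for name in REQUIRED_FIELDS if name not in header]


good = io.StringIO("Test Time (s),Voltage (V),Current (A),Notes\n0,3.7,0.5,rest\n")
bad = io.StringIO("Test Time (s),Voltage (V)\n0,3.7\n")

print(missing_required_fields(good))  # []
print(missing_required_fields(bad))   # ['Current (A)']
```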
The default BDS CSV and Parquet exports use user-facing labels with units in
parentheses. Lower-level adapter data may use internal labels; prefer
bds convert or to_export_frame(..., target="bds") for public handoff.
Supported Format Families
The package includes adapters for NEWARE, Arbin, Maccor, BioLogic, Repower, PEC, Novonix, BaSyTec, LANDT, and generic tabular exports. Generic readers support delimited text, Excel, MATLAB, and Parquet inputs where the file contains or can be mapped to time, voltage, and current columns.
BioLogic .mpt text exports are supported by default. Binary BioLogic .mpr
files are supported through the optional mpr extra, which installs the
galvani backend.
See docs/supported-formats.md for adapter scope and support-tier definitions.
EIS Data
EIS files use a separate standardized table from row-wise time-series data. Use EIS-specific commands or API functions for known impedance files:
bds detect-kind impedance.csv
bds convert-eis impedance.csv normalized.eis.csv
eis = bds.read_eis("impedance.csv")
report = bds.convert_eis("impedance.csv", "normalized.eis.csv")
read() and convert() are time-series entry points. The batch command and
batch_convert() can route mixed directories and archives that contain
time-series files, EIS files (including Gamry .DTA ZCURVE files), and
unsupported helper files.
Profiles
Profiles map lab-specific column names to canonical column names. JSON profiles
are supported by the base installation. YAML profiles require the yaml extra.
{
  "columns": {
    "test_time": "time_seconds",
    "voltage": "cell_voltage",
    "current": "cell_current"
  }
}
Use a profile from the CLI:
bds convert lab_export.csv normalized.bds.csv --cycler generic --profile profile.json
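To show what such a mapping does to a record, the sketch below renames lab-specific keys to canonical ones using the profile shown above. It assumes the profile reads as canonical name on the left and lab-specific name on the right, as the example suggests; apply_profile is a hypothetical helper and not how bds applies profiles internally.

```python
import json

profile_text = """
{
  "columns": {
    "test_time": "time_seconds",
    "voltage": "cell_voltage",
    "current": "cell_current"
  }
}
"""


def apply_profile(row, profile):
    """Rename lab-specific keys in one row to canonical names."""
    # Invert the mapping: lab-specific name -> canonical name.
    lab_to_canonical = {lab: canon for canon, lab in profile["columns"].items()}
    # Keys without a mapping are passed through unchanged.
    return {lab_to_canonical.get(key, key): value for key, value in row.items()}


profile = json.loads(profile_text)
row = {"time_seconds": 0.0, "cell_voltage": 3.7, "cell_current": -0.5, "note": "rest"}
print(apply_profile(row, profile))
# {'test_time': 0.0, 'voltage': 3.7, 'current': -0.5, 'note': 'rest'}
```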
Current Sign Convention
The default current convention is charge-positive and discharge-negative.
Use --current-sign preserve to retain the source sign convention, or
--current-sign discharge-positive when a downstream workflow requires
discharge-positive current. When a source file contains a recognizable
charge/discharge status column, adapters use it to normalize current sign more
explicitly.
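The three options amount to either leaving the values untouched or flipping their sign. The sketch below is a simplified illustration of that arithmetic; normalize_current is a hypothetical function, and the label "charge-positive" is an assumed name for the default convention rather than a documented flag value.

```python
def normalize_current(currents, source="charge-positive", target="charge-positive"):
    """Convert current samples between sign conventions.

    A target of "preserve" returns the values unchanged.
    """
    conventions = {"charge-positive", "discharge-positive"}
    if target == "preserve":
        return list(currents)
    if source not in conventions or target not in conventions:
        raise ValueError(f"unknown convention: {source!r} -> {target!r}")
    # Flip the sign only when the source and target conventions differ.
    sign = 1.0 if source == target else -1.0
    return [sign * value for value in currents]


# A source that records discharge as positive, converted to the default.
print(normalize_current([1.5, -2.0], source="discharge-positive"))  # [-1.5, 2.0]
print(normalize_current([1.5, -2.0], target="preserve"))            # [1.5, -2.0]
```

This sketch requires the source convention to be known in advance; it does not attempt the status-column inference described above.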
Validation and Reports
Every conversion returns a machine-readable report with schema version, row
count, columns, validation status, warnings, provenance, adapter metadata, repair
operations, and time-sampling findings. With report_path="auto", BDS writes
JSON and PDF reports by default; HTML and Excel reports are optional formats.
Strict validation is enabled by default. Repairable issues are reported with the
default repair_policy="warn". Use repair_policy="repair" or
--repair-policy repair only when the pipeline explicitly accepts the documented
normalizations.
Documentation
Public documentation is in the docs directory:
- Python API reference
- Supported formats
- Export template
- Export targets
- Schema compatibility
- Diagnostics and audit reports
- Current sign convention
- Step and cycle semantics
- BDF compatibility status
- Warning and issue codes
- Ecosystem integrations
License and Attribution
This project is distributed under the MIT License. See NOTICE.md for the project independence notice.