
aemo-mdff-reader


Fast, zero-dependency streaming reader for AEMO NEM12 and NEM13 metering files. Implements AEMO MDFF (Meter Data File Format) v2.6.

  • O(1) memory — iterate through millions of intervals.
  • Pure stdlib core; pandas / PyMySQL are opt-in extras.
  • ~2 M readings/sec on the columnar fast path.
  • Includes an aemo-mdff-reader CLI.

Install

pip install aemo-mdff-reader

# optional extras
pip install aemo-mdff-reader[pandas]   # to_dataframe() / parquet
pip install aemo-mdff-reader[mysql]    # SQL persistence

Use

from aemo_mdff_reader import parse

for r in parse("metering.csv"):
    print(r.nmi, r.interval_start, r.value, r.uom)

Or as a flat CSV / DataFrame:

from aemo_mdff_reader import parse, write_csv, to_dataframe

write_csv(parse("metering.csv"), "out.csv")     # no pandas
df = to_dataframe("metering.csv")                # needs [pandas]

From the command line:

aemo-mdff-reader metering.csv -o out.csv
aemo-mdff-reader metering.csv --validate                       # spec check
aemo-mdff-reader metering.csv --nmi NMI1234567 --start 2024-01-01 --end 2024-01-31
aemo-mdff-reader manual.csv --records accumulations            # NEM13

Working with the data

Each parsed record is a slots-based class with named attributes plus a to_dict() for JSON / dict pipelines:

for r in parse("metering.csv"):
    payload = r.to_dict()              # {"nmi": "...", "value": 0.12, ...}
    print(r.quality_flag, r.method_flag)  # split of the QMM field, e.g. "S", "52"

For aggregation, aemo_mdff_reader.aggregate provides streaming helpers:

from aemo_mdff_reader import parse
from aemo_mdff_reader.aggregate import group_by_nmi, daily_totals

for key, group in group_by_nmi(parse("metering.csv")):
    # key = ChannelKey(nmi, register_id, nmi_suffix)
    intervals = list(group)

for day in daily_totals(parse("metering.csv")):
    # day.total, day.interval_count, day.unique_quality_flags
    print(day.nmi, day.interval_date.date(), day.total, day.uom)

End-to-end recipes — load + inspect, daily roll-up, filter to pandas, spec validation — live in examples/.
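Under the hood, sorted-stream grouping like group_by_nmi is the classic itertools.groupby pattern: NEM12 interval rows arrive contiguously under their 200 record, so only one group is ever in memory. A stdlib-only sketch of the idea (illustrative, not the library's actual implementation, and using a simplified Reading type):

```python
from itertools import groupby
from operator import attrgetter
from typing import Iterable, Iterator, NamedTuple


class Reading(NamedTuple):
    """Simplified stand-in for an interval reading."""
    nmi: str
    value: float


def group_totals(rows: Iterable[Reading]) -> Iterator[tuple[str, float]]:
    """Yield (nmi, total) one group at a time.

    Assumes rows for each NMI are contiguous, as they are in a NEM12
    file; peak memory is one group, not the whole stream.
    """
    for nmi, group in groupby(rows, key=attrgetter("nmi")):
        yield nmi, sum(r.value for r in group)
```

The same contiguity assumption is why the library's streaming helpers can run in O(1) memory.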

API at a glance

You want                              Call
300 interval readings (NEM12)         parse(src)
250 accumulations (NEM13)             parse_accumulations(src)
Both, in file order                   parse_all(src)
400 quality / event flags             parse_events(src)
500 / 550 B2B transactions            parse_b2b(src)
Just the 100 header                   parse_header(src)
Build a pandas DataFrame              to_dataframe(src)
Write a flat CSV (no pandas)          write_csv(rows, out)
Validate against AEMO MDFF v2.6       validate_file(src)
Compute / verify an NMI checksum      nmi_checksum, validate_nmi
Group readings by NMI / channel       aggregate.group_by_nmi(rows)
Roll up to daily totals               aggregate.daily_totals(rows)
Convert any record to a plain dict    r.to_dict()

src can be a path, a file-like object, an iterable of CSV lines, or an iterable of pre-split rows. The v1 NEMReader facade (read_from_file, to_dataframe, to_csv) still works.
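For nmi_checksum / validate_nmi, the underlying algorithm is AEMO's published NMI checksum: a Luhn-style digit sum over the ASCII codes of the 10-character NMI. A stdlib-only sketch, written independently of this library (2001985732 with checksum 8 is, to our knowledge, the worked example in AEMO's NMI procedure):

```python
def nmi_checksum(nmi: str) -> int:
    """Compute the AEMO NMI checksum digit (Luhn variant over ASCII codes).

    Starting from the rightmost character, every alternate ASCII value
    is doubled; the decimal digits of all resulting values are summed;
    the checksum is whatever brings the total up to a multiple of 10.
    """
    total = 0
    for i, ch in enumerate(reversed(nmi)):
        v = ord(ch)
        if i % 2 == 0:  # rightmost character is doubled first
            v *= 2
        total += sum(int(d) for d in str(v))
    return (10 - total % 10) % 10
```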

Each parse(...) yields an IntervalReading with nmi, meter_serial_number, register_id, nmi_suffix, uom, interval_length, interval_date, interval_start, interval_end, interval_index, value, quality_method, reason_code, reason_description, update_datetime, msats_load_datetime. See the type stubs (from aemo_mdff_reader import IntervalReading) for the exact signatures.

Notes

  • Spec: AEMO Meter Data File Format Specification NEM12 & NEM13, v2.6 (effective 29 September 2024). Records 100, 200, 250, 300, 400, 500, 550, 900 are all surfaced; unknown indicators are ignored. Allowed values for quality flags, transaction codes, reason codes, units of measure, and direction indicators are exposed as constants in aemo_mdff_reader.spec for callers that want stricter validation than the parser performs.
  • Tolerant: a UTF-8 BOM is consumed silently, LF and CRLF both work, and empty interval cells are coerced to 0.0 (use quality_method to distinguish missing from zero).
  • Datetimes: fields accept the spec forms (YYYYMMDD, YYYYMMDDhhmmss) and a few common non-spec variants (YYYY-MM-DD, ISO YYYY-MM-DDTHH:MM:SS, with or without a Z / ±HH:MM / ±HHMM timezone suffix); the suffix is stripped and parsed datetimes are returned naive.
  • Lenient fields: direction_indicator on 250 records passes through whatever the file emits; the spec set is spec.DIRECTION_INDICATORS = {"I", "E"}, but B and N appear in the wild. The parser also accepts non-spec interval lengths (1, 60, etc.); strict callers should compare against spec.ALLOWED_INTERVAL_LENGTHS (= {5, 15, 30}).
  • Migration from v1: NEMReader still works. The internal aemo_mdff_reader.nemstructure package is gone — see the API table above. pandas is now opt-in. See CHANGELOG.md for details.
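The tolerant datetime handling described in the notes can be sketched with the stdlib alone (a hypothetical helper; the library's internal parser may differ):

```python
import re
from datetime import datetime

# Spec forms first, then the tolerated ISO variants.
_FORMATS = ("%Y%m%d%H%M%S", "%Y%m%d", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")
_TZ_SUFFIX = re.compile(r"(Z|[+-]\d{2}:?\d{2})$")


def parse_mdff_datetime(raw: str) -> datetime:
    """Parse spec (YYYYMMDD / YYYYMMDDhhmmss) and common ISO variants.

    Any trailing Z / ±HH:MM / ±HHMM suffix is stripped first and the
    result is returned naive, mirroring the behaviour described above.
    """
    raw = _TZ_SUFFIX.sub("", raw.strip())
    for fmt in _FORMATS:
        try:
            return datetime.strptime(raw, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised MDFF datetime: {raw!r}")
```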

Performance

420,480 readings (4 NMIs × 365 days × 5-min, 2.8 MiB CSV), Python 3.11:

Operation                      Time
for r in parse(path): ...      0.45 s
parse_to_columns(path)         0.21 s
to_dataframe(path) (pandas)    0.76 s

~2.7× faster than v1 end-to-end; reproduce with python benchmarks/bench_parser.py.
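benchmarks/bench_parser.py is the authoritative script; for quick measurements on other inputs, a minimal throughput harness in the same spirit might look like this (generic: it drains any iterator, so the names here are illustrative):

```python
import time
from typing import Iterable


def throughput(rows: Iterable, label: str = "parse") -> tuple[int, float]:
    """Drain an iterator and report rows/sec, the metric quoted above."""
    start = time.perf_counter()
    n = sum(1 for _ in rows)
    elapsed = time.perf_counter() - start
    rate = n / elapsed if elapsed > 0 else float("inf")
    print(f"{label}: {n:,} rows in {elapsed:.3f} s ({rate:,.0f} rows/s)")
    return n, elapsed
```

Used as, e.g., `throughput(parse("metering.csv"))` to time the streaming path end to end.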

Large files

The parser is built to scale to gigabyte-class NEM12 files without loading them into RAM. Measured peak memory delta on a synthetic 10.5 M-reading file (100 NMIs × 365 days × 5-min, 71 MiB CSV), Python 3.12:

API                                        Memory profile      Peak Δ
for r in parse(path): ...                  streaming           1.3 MiB
daily_totals(parse(path))                  streaming           0 MiB
write_csv(parse(path), out)                streaming           0 MiB
iter_dataframes(path, chunk_size=N)        bounded O(N)        ~30 MiB / 100k
iter_columns_chunks(path, chunk_size=N)    bounded O(N)        ~10 MiB / 100k
parse_to_columns(path)                     full materialise    ~600 MiB
list(parse(path)) / to_dataframe(path) /
  NEMReader.read_from_file()               full materialise    ~2.5 GiB

Rule of thumb: stay on the streaming or chunked APIs for any file larger than a few hundred MiB. The chunked variants make pandas-based workflows safe on arbitrarily large inputs:

from aemo_mdff_reader import iter_dataframes

# Process a multi-GiB file 50,000 readings at a time.
for df in iter_dataframes("huge.csv", chunk_size=50_000):
    daily = df.groupby(["NMI", "IntervalDate"])["Value"].sum()
    daily.to_csv("out.csv", mode="a", header=False)

The NEMReader facade and to_dataframe(path) materialise their inputs by design (so len(reader) and df.iloc[...] work). Avoid them for files that won't fit in RAM.
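The bounded-memory behaviour of iter_dataframes / iter_columns_chunks comes down to slicing a stream into fixed-size batches; a generic stdlib sketch of that pattern (not the library's code):

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(rows: Iterable[T], chunk_size: int) -> Iterator[list[T]]:
    """Yield lists of at most chunk_size items.

    Peak memory is one chunk regardless of total input length, which
    is what makes chunked pandas workflows safe on huge files.
    """
    it = iter(rows)
    while chunk := list(islice(it, chunk_size)):
        yield chunk
```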

Development

git clone https://github.com/Utilified/aemo-mdff-reader.git
cd aemo-mdff-reader
pip install -e .[dev]
pytest

CI runs ruff, mypy --strict, the test matrix (Python 3.11 and 3.12 on Linux, macOS, and Windows), pip-audit, bandit, CodeQL, OpenSSF Scorecard, and a wheel-install smoke test.

Releases are automated by release-please from Conventional Commits on main, then signed with sigstore, attested with SLSA build provenance and a CycloneDX SBOM, and published to PyPI via Trusted Publishing. See CONTRIBUTING.md for the contributor commit conventions and the full release flow.

License

MIT — see LICENSE.

Download files


Source Distribution

aemo_mdff_reader-2.2.0.tar.gz (58.8 kB)


Built Distribution


aemo_mdff_reader-2.2.0-py3-none-any.whl (35.7 kB)


File details

Details for the file aemo_mdff_reader-2.2.0.tar.gz.

File metadata

  • Download URL: aemo_mdff_reader-2.2.0.tar.gz
  • Size: 58.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? Yes
  • Uploaded via: twine/6.1.0 CPython/3.13.13

File hashes

Hashes for aemo_mdff_reader-2.2.0.tar.gz
Algorithm      Hash digest
SHA256         a43d8c3ee0e68c3e2cc31a56278cd80d2b7522d6db5d6045f525b7226d8f4689
MD5            85d1882b346a48528fb503c5a2cb2254
BLAKE2b-256    364cb4409a47cbc76e9c380b25f83f4e9c1003248e6fa548e73481f463172e65


Provenance

The following attestation bundles were made for aemo_mdff_reader-2.2.0.tar.gz:

Publisher: release.yml on Utilified/aemo-mdff-reader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file aemo_mdff_reader-2.2.0-py3-none-any.whl.

File metadata

File hashes

Hashes for aemo_mdff_reader-2.2.0-py3-none-any.whl
Algorithm      Hash digest
SHA256         8d54685e04acc010a26aa333c6fa80f0e9813baa870188a8ae87960e155e43e9
MD5            aa54d31cddf3cecda90f2e51e74462fd
BLAKE2b-256    7e475b4d41596394feab76f9ebe78c21d09dfbf998486681bd67accb4332f274


Provenance

The following attestation bundles were made for aemo_mdff_reader-2.2.0-py3-none-any.whl:

Publisher: release.yml on Utilified/aemo-mdff-reader

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.
