mxm-dataio

Unified ingestion, caching, and audit layer for Money Ex Machina.

Overview

mxm-dataio is Money Ex Machina’s lightweight ingestion and audit backbone.
It records every external interaction (Session → Request → Response),
persists exact payload bytes, and stores structured metadata in SQLite.

It is designed for deterministic reproducibility, offline caching,
and transparent provenance across all MXM data sources.

Architecture at a glance

mxm-dataio/
├── DataIoSession      → runtime context (one logical run)
├── Request / Response → atomic data transactions
├── adapters/          → pluggable fetch/send implementations
└── store/             → SQLite-backed metadata and byte storage

Each interaction is represented as:

Session ─┬─> Request ──> Response
          └─> Request ──> Response
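
One way to picture the Session → Request → Response chain in SQLite is the sketch below. Table and column names are assumptions chosen for illustration (status, checksum, and path echo the attributes printed in the Runtime API example); the schema mxm-dataio actually uses may differ.

```python
import sqlite3

# Hypothetical schema: one session owns many requests, each request has at most
# one archived response in this simplified picture.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id         TEXT PRIMARY KEY,
    source     TEXT NOT NULL,
    started_at TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS requests (
    id          TEXT PRIMARY KEY,
    session_id  TEXT NOT NULL REFERENCES sessions(id),
    kind        TEXT NOT NULL,
    params_json TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS responses (
    id         TEXT PRIMARY KEY,
    request_id TEXT NOT NULL REFERENCES requests(id),
    status     TEXT NOT NULL,
    checksum   TEXT NOT NULL,
    path       TEXT NOT NULL
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and make sure the three tables exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    return conn


def session_trail(conn: sqlite3.Connection, session_id: str) -> list[tuple]:
    """Return (request id, response status, checksum) for every request in a session."""
    query = """
        SELECT r.id, p.status, p.checksum
        FROM requests AS r
        LEFT JOIN responses AS p ON p.request_id = r.id
        WHERE r.session_id = ?
        ORDER BY r.id
    """
    return conn.execute(query, (session_id,)).fetchall()
```

The LEFT JOIN keeps requests that never received a response visible in the trail, which is the kind of gap an audit usually needs to surface.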

Response metadata (JSON) and raw payload bytes are stored, respectively, under:

<root>/responses/<session>/<hash>.json
<root>/blobs/<session>/<hash>.bin
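
The layout above can be sketched as a small helper that writes the payload and a JSON sidecar side by side. This is an illustration only: it assumes `<hash>` is the SHA-256 of the payload bytes, and `persist_response` and the `checksum`/`size` keys are hypothetical names, not part of the package.

```python
import hashlib
import json
from pathlib import Path


def persist_response(
    root: Path, session_id: str, payload: bytes, meta: dict
) -> tuple[Path, Path]:
    """Write exact payload bytes plus a JSON metadata sidecar, named by content hash."""
    # Assumption: the file name is the SHA-256 hex digest of the payload.
    digest = hashlib.sha256(payload).hexdigest()

    blob_path = root / "blobs" / session_id / f"{digest}.bin"
    meta_path = root / "responses" / session_id / f"{digest}.json"
    blob_path.parent.mkdir(parents=True, exist_ok=True)
    meta_path.parent.mkdir(parents=True, exist_ok=True)

    # Bytes are written untouched so the archive can be replayed exactly.
    blob_path.write_bytes(payload)

    record = {**meta, "checksum": digest, "size": len(payload)}
    meta_path.write_text(
        json.dumps(record, sort_keys=True, indent=2), encoding="utf-8"
    )
    return blob_path, meta_path
```

Naming files by content hash means a repeated identical payload overwrites itself rather than accumulating copies, and the file name doubles as an integrity check.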

Core model

| Concept | Role |
|---|---|
| Session | Groups a set of related requests; ensures atomic persistence. |
| Request | Deterministic identity of an operation (method + URL + params + headers). |
| Response | Archived payload, metadata, and audit fields. |
| Adapter | Tiny class implementing fetch() or send() returning an AdapterResult. |
| Registry | Runtime mapping from adapter name → adapter instance. |
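
The Request row describes identity as a function of method, URL, params, and headers. A minimal sketch of how such an ID could be derived is shown here, assuming canonical JSON hashed with SHA-256; the function name `request_id` and the normalization rules (upper-cased method, lower-cased header names) are assumptions, not the package's documented behavior.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Mapping


def request_id(
    method: str,
    url: str,
    params: Mapping[str, Any] | None = None,
    headers: Mapping[str, str] | None = None,
) -> str:
    """Derive a stable hex ID from the parts that define a request."""
    canonical = {
        "method": method.upper(),
        "url": url,
        "params": dict(params or {}),
        # Header names are case-insensitive in HTTP, so normalize them.
        "headers": {k.lower(): v for k, v in (headers or {}).items()},
    }
    # sort_keys and fixed separators make the encoding independent of
    # dict insertion order and whitespace.
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
```

With this approach, two calls that differ only in parameter order or header-name casing map to the same ID, while any change to a value produces a different one.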

Runtime API

DataIoSession

The main entry point for ingestion or submission tasks.

from mxm_config import load_config
from mxm_dataio.adapters import HttpFetcher
from mxm_dataio.api import DataIoSession
from mxm_dataio.config.config import dataio_view
from mxm_dataio.registry import register_adapter

# Load the global MXM config and take the dataio view of it
cfg = load_config(package="mxm-dataio", env="dev", profile="default")
dio_cfg = dataio_view(cfg)

# Register an adapter under a source name
register_adapter("http", HttpFetcher())  # implements Fetcher

# Use the session with that source name
with DataIoSession(source="http", cfg=dio_cfg) as io:
    req = io.request(kind="demo", params={"q": "mxm"})
    resp = io.fetch(req)
    print(resp.status, resp.checksum, resp.path)

AdapterResult objects contain both the raw payload and normalized metadata:

from typing import Any

class AdapterResult:
    data: bytes
    content_type: str | None
    transport_status: int | None
    url: str | None
    elapsed_ms: int | None
    headers: dict[str, str] | None
    adapter_meta: dict[str, Any] | None

Configuration

mxm-dataio reads its settings from the dataio subtree of the global MXM config. Downstream packages obtain read-only views via mxm_config.make_view.

Adapters

Adapters provide I/O logic while mxm-dataio handles persistence.

Example (simplified):

from typing import Any

import requests

from mxm_dataio.adapters import BaseFetcher
from mxm_dataio.types import AdapterResult


class HttpFetcher(BaseFetcher):
    def fetch(self, url: str, **params: Any) -> AdapterResult:
        # A timeout keeps a stalled server from hanging the session.
        r = requests.get(url, params=params, timeout=30)
        # Field names follow the AdapterResult definition shown earlier.
        return AdapterResult(
            data=r.content,
            content_type=r.headers.get("content-type"),
            transport_status=r.status_code,
            url=r.url,
            elapsed_ms=int(r.elapsed.total_seconds() * 1000),
            headers=dict(r.headers),
            adapter_meta=None,
        )

Adapters can be registered dynamically:

from mxm_dataio.registry import register_adapter
register_adapter("http", HttpFetcher())
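
To make the adapter and registry roles concrete, here is a self-contained sketch that does not import mxm-dataio. `Result`, `StaticFetcher`, `get_adapter`, and the duplicate-name check are all assumptions made for illustration; only the idea of a name → instance mapping comes from the text above.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Result:
    """Stand-in for AdapterResult, reduced to two fields for brevity."""

    data: bytes
    content_type: str | None = None


class Fetcher(Protocol):
    """Anything with a matching fetch() method satisfies this protocol."""

    def fetch(self, url: str) -> Result: ...


class StaticFetcher:
    """Hermetic adapter that serves canned bytes instead of using the network."""

    def __init__(self, pages: dict[str, bytes]) -> None:
        self._pages = pages

    def fetch(self, url: str) -> Result:
        return Result(data=self._pages[url], content_type="application/json")


# Module-level mapping from adapter name to adapter instance.
_REGISTRY: dict[str, Fetcher] = {}


def register_adapter(name: str, adapter: Fetcher) -> None:
    """Bind a name to an adapter; refusing duplicates is an assumption here."""
    if name in _REGISTRY:
        raise ValueError(f"adapter {name!r} is already registered")
    _REGISTRY[name] = adapter


def get_adapter(name: str) -> Fetcher:
    """Look up an adapter by name, with a clearer error than a bare KeyError."""
    try:
        return _REGISTRY[name]
    except KeyError:
        raise KeyError(f"no adapter registered under {name!r}") from None
```

Because the protocol is structural, `StaticFetcher` needs no base class, which makes it easy to swap a canned adapter in for a networked one during tests.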

Quick examples

Fetch and cache a resource

session = DataIoSession(cfg=dio_cfg)
result = session.fetch("https://example.com/data.json", fetcher="http")
print(result.status_code)

The payload and metadata are stored automatically: metadata in SQLite, payload bytes on the filesystem. Subsequent identical requests are served from the cache unless force_refresh=True.
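
The cache-or-fetch decision can be sketched as follows. This is a simplified illustration, not the package's implementation: `cached_fetch` and the `cache` table are hypothetical, and the payload is kept in SQLite here only to keep the example short, whereas mxm-dataio stores bytes on the filesystem.

```python
import hashlib
import sqlite3
from typing import Callable


def cached_fetch(
    conn: sqlite3.Connection,
    key: str,
    fetch: Callable[[], bytes],
    force_refresh: bool = False,
) -> bytes:
    """Return the cached payload for key, calling fetch() only on a miss or refresh."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache ("
        "key TEXT PRIMARY KEY, checksum TEXT NOT NULL, payload BLOB NOT NULL)"
    )

    if not force_refresh:
        row = conn.execute(
            "SELECT payload FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is not None:
            return bytes(row[0])  # cache hit: no external call is made

    payload = fetch()
    checksum = hashlib.sha256(payload).hexdigest()
    with conn:  # commit the new entry atomically
        conn.execute(
            "INSERT OR REPLACE INTO cache (key, checksum, payload) VALUES (?, ?, ?)",
            (key, checksum, payload),
        )
    return payload
```

Passing `force_refresh=True` skips the lookup and overwrites the stored entry, so later calls see the refreshed payload.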

Send data to an API

result = session.send("https://api.example.com/upload", data=b"...", sender="http")
print(result.status_code)

Design principles

  • Deterministic: identical inputs yield identical request IDs.
  • Auditable: all payloads and headers persisted for replay.
  • Minimal dependencies: pure Python, no ORM or framework assumptions.
  • Composable: adapters plug into any MXM package via registry.
  • Readable data: SQLite + JSON + raw bytes, human-inspectable.

Testing & quality

All tests are pure Python and hermetic: they make no network calls.
Configuration YAMLs are loaded directly from the repo using a temporary
MXM_CONFIG_HOME fixture. The project is validated with:

pytest -q
pyright --strict
ruff check .
black --check .
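
The temporary MXM_CONFIG_HOME setup mentioned above could look roughly like this. It is a standard-library sketch rather than the project's actual fixture; `temporary_config_home` and the example file name are hypothetical, and in a pytest suite the same idea is usually expressed with `tmp_path` and `monkeypatch.setenv`.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def temporary_config_home(files: dict[str, str]) -> Iterator[Path]:
    """Point MXM_CONFIG_HOME at a throwaway directory seeded with config files."""
    previous = os.environ.get("MXM_CONFIG_HOME")
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp)
        # Write each config file relative to the temporary home.
        for relative_name, text in files.items():
            target = home / relative_name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(text, encoding="utf-8")

        os.environ["MXM_CONFIG_HOME"] = str(home)
        try:
            yield home
        finally:
            # Restore the caller's environment exactly as it was.
            if previous is None:
                os.environ.pop("MXM_CONFIG_HOME", None)
            else:
                os.environ["MXM_CONFIG_HOME"] = previous
```

Restoring the previous value in `finally` keeps one test's configuration from leaking into the next, even when the test body raises.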

Roadmap

  • Async adapters (aiohttp, websockets).
  • Multi-backend persistence (S3, DuckDB).
  • Delta auditing and content hashing improvements.
  • CLI for session inspection and cache management.

Repository layout

mxm_dataio/
  adapters/       → built-in adapter implementations
  config/         → default YAMLs and view helpers
  store/          → persistence backend
  types.py        → protocol and dataclasses
tests/            → pytest suite (hermetic)

License

MIT © Money Ex Machina

Unified ingestion, caching, and audit layer for the Money Ex Machina (MXM) ecosystem. mxm-dataio records every interaction with an external system (who/what/when, the exact bytes returned, and optional transport metadata) so downstream packages are reproducible and auditable.
