Automated Flexible I/O tester — profile storage and recommend an optimal I/O block size.

auto-fio — Automated Flexible I/O tester

Profile your storage device and get the optimal I/O block size — automatically.

Requires Python 3.9+. License: MIT.

Why

Block-sequential data pipelines (like CFM64) need one hardware-dependent number: the smallest block size that saturates your device's sequential-read throughput. Picking it by hand means running an fio sweep and eyeballing the knee of the curve. auto-fio automates exactly that — sweep, measure, detect the knee, return the number.

Install

Prerequisite: install fio. auto-fio drives the fio benchmark for its gold-standard measurements (it uses --direct=1 to bypass the OS page cache). Install it first so it's on your PATH:

# macOS
brew install fio
# Debian/Ubuntu
sudo apt install fio
# Fedora/RHEL
sudo dnf install fio

Then install auto-fio:

pip install auto-fio

Without fio, auto-fio falls back to a portable, zero-dependency python backend so a result is always available — but its numbers can be optimistic where the OS caches the file. Install fio for publication-grade accuracy. See Backends.

If fio is missing, auto-fio prints a notice on stderr at runtime with the exact install command for your OS, then continues on the python backend. It never installs anything or modifies your system on your behalf. Because the notice goes to stderr, --json output on stdout stays clean. Silence the notice by selecting the fallback explicitly with --backend python.
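
The fallback described above can be pictured roughly as follows. This is a minimal sketch, not auto-fio's actual code: the name choose_backend, its which parameter, and the wording of the notice are assumptions made for illustration. Only shutil.which and the documented behaviour (notice on stderr, continue on the python backend) are taken as given.

```python
import shutil
import sys


def choose_backend(requested=None, which=shutil.which):
    """Return "fio" or "python" (hypothetical helper, not part of auto-fio)."""
    if requested is not None:
        # An explicit choice (like --backend python) skips detection and the notice.
        return requested
    if which("fio") is not None:
        return "fio"
    # The notice goes to stderr so anything printed on stdout stays parseable.
    print("fio not found on PATH; continuing with the python backend", file=sys.stderr)
    return "python"
```

Keeping the PATH lookup injectable (the which argument) is only a convenience for testing the sketch without changing the machine it runs on.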

Use it — command line

auto-fio /path/on/the/device

auto-fio · backend=fio · threshold=95%

    512 KiB    1180.4 MB/s
      1 MiB    1910.2 MB/s
      2 MiB    2740.9 MB/s
      4 MiB    3120.5 MB/s
      8 MiB    3305.1 MB/s  <-- optimal
     16 MiB    3319.8 MB/s
     32 MiB    3322.0 MB/s

Optimal block size: 8388608 bytes (8.0 MiB) — reaches 95% of the 3322 MB/s peak.
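
The threshold arithmetic behind that last line can be checked by hand. The snippet below is only a sanity check on the sample output, using the throughput column in the order printed. It says nothing about how auto-fio is implemented internally.

```python
# Throughputs (MB/s) copied from the sample sweep, in the order printed.
throughputs = [1180.4, 1910.2, 2740.9, 3120.5, 3305.1, 3319.8, 3322.0]

peak = max(throughputs)   # 3322.0 MB/s
cutoff = 0.95 * peak      # 95% threshold

# Index of the first row whose throughput reaches the cutoff.
first_ok = next(i for i, t in enumerate(throughputs) if t >= cutoff)

print(f"cutoff = {cutoff:.1f} MB/s, first row at or above it: "
      f"{first_ok + 1} ({throughputs[first_ok]} MB/s)")
# cutoff = 3155.9 MB/s, first row at or above it: 5 (3305.1 MB/s)
```

The fourth row (3120.5 MB/s) falls just short of the cutoff, which is why the fifth row is the one marked optimal.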

Machine-readable output for scripts/CI: auto-fio /data --json.
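
A script or CI job could consume that output along these lines. This is a sketch under assumptions: the JSON field names are not documented in this README, so the helper returns the parsed document without relying on any particular key. The name profile_via_cli and its run parameter are hypothetical.

```python
import json
import subprocess


def profile_via_cli(path, run=subprocess.run):
    """Run `auto-fio PATH --json` and return the parsed JSON document."""
    completed = run(
        ["auto-fio", path, "--json"],
        capture_output=True,  # keep stdout (JSON) and stderr (notices) apart
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return json.loads(completed.stdout)
```

Calling profile_via_cli("/data") and printing the result is a reasonable first step for seeing which fields the installed version emits before depending on them.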

Use it — Python

import auto_fio

# Just the number (bytes):
block_size = auto_fio.optimal_block_size("/data")

# Full sweep + metadata:
result = auto_fio.profile("/data", threshold=0.95)
print(result.optimal_block_size, result.peak_throughput_mbps, result.backend)

Backends

| Backend | When | Accuracy |
|---------|------|----------|
| fio | fio is on PATH | Gold standard: uses --direct=1 to bypass the OS page cache |
| python | fio unavailable (default fallback) | Portable, zero-dependency; cache eviction is best-effort (F_NOCACHE/posix_fadvise), so numbers can be optimistic where the OS caches the file |

Force one with --backend fio / --backend python, or backend=... in Python. For publication-grade measurements, install fio and use the fio backend; the python backend exists so a result is always available and to cross-check fio.
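
To make "best-effort cache eviction" concrete, here is a rough sketch of a sequential-read timing loop that issues those hints when the platform offers them. It is an illustration only, not the python backend's real code: the function names are hypothetical, and the hints are advisory, so the OS may still serve reads from cache.

```python
import os
import time


def drop_cache_best_effort(fd):
    """Issue a cache hint for fd if one is available; return True if issued."""
    try:
        if hasattr(os, "posix_fadvise"):  # Linux and some other Unixes
            # Length 0 means "to end of file"; only clean pages are dropped.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
            return True
        import fcntl  # not available on Windows
        nocache = getattr(fcntl, "F_NOCACHE", None)  # macOS only
        if nocache is not None:
            fcntl.fcntl(fd, nocache, 1)
            return True
    except (ImportError, OSError):
        pass
    return False


def sequential_read(path, block_size):
    """Read path front to back in block_size chunks; return (bytes, MB/s)."""
    fd = os.open(path, os.O_RDONLY | getattr(os, "O_BINARY", 0))
    try:
        drop_cache_best_effort(fd)
        total = 0
        start = time.perf_counter()
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:  # empty bytes object signals end of file
                break
            total += len(chunk)
        elapsed = max(time.perf_counter() - start, 1e-9)  # avoid dividing by 0
    finally:
        os.close(fd)
    return total, total / elapsed / 1e6
```

Because nothing here bypasses the page cache the way fio's --direct=1 does, a file that was recently written or read can still produce inflated figures, which is the optimism the table warns about.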

How it decides — the "knee"

auto-fio measures throughput at each block size, finds the peak, and returns the smallest block size whose throughput reaches a threshold fraction (default 95%) of that peak. Smaller is better once the device is saturated: each read buffers fewer bytes for the same bandwidth. This decision rule lives in a pure, unit-tested function (auto_fio.detect_knee), so it can be validated independently of any benchmark.
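
The rule as stated can be written in a few lines. The sketch below is a re-implementation from the description, not the source of auto_fio.detect_knee, whose actual signature is not shown in this README. The name detect_knee_sketch and the (block_size, throughput) pair format are assumptions, and the numbers in the usage comment are made up for illustration, not measurements.

```python
def detect_knee_sketch(samples, threshold=0.95):
    """Return the smallest block size whose throughput is >= threshold * peak.

    samples: iterable of (block_size_bytes, throughput) pairs, in any order.
    """
    points = sorted(samples)  # ascending by block size
    if not points:
        raise ValueError("need at least one (block_size, throughput) sample")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    peak = max(throughput for _, throughput in points)
    cutoff = threshold * peak
    # The peak sample itself always qualifies, so the search cannot come up empty.
    return next(size for size, throughput in points if throughput >= cutoff)


# Illustrative, made-up sweep: peak is 200.0, so the 95% cutoff is 190.0.
# detect_knee_sketch([(4096, 100.0), (8192, 180.0), (16384, 196.0), (32768, 200.0)])
# returns 16384, the first size at or above the cutoff.
```

Keeping the rule free of I/O is what makes it testable in isolation: it takes numbers in and returns a number, with no benchmark involved.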

Companion to CFM64

import auto_fio
from cfm64 import CFM64Loader, TextBlockDataset

bs = auto_fio.optimal_block_size("/data/train")
dataset = TextBlockDataset("/data/train.csv", block_size_bytes=bs)
loader = CFM64Loader(dataset, batch_size=64, seed=42)

Run auto-fio once per machine, commit the number, and your pipeline is tuned to that hardware — no manual fio step.
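
One way to "commit the number" is to keep it in a small file next to the pipeline code and measure only when that file is absent. This is a sketch of that idea, not a feature of auto-fio: the helper name, the file name, and the block_size_bytes key are all hypothetical.

```python
import json
from pathlib import Path


def load_or_measure(device_path, cache_file, measure):
    """Return the cached block size, measuring and saving it on first use.

    measure: a callable taking a path and returning a block size in bytes,
    for example auto_fio.optimal_block_size.
    """
    cache = Path(cache_file)
    if cache.exists():
        return int(json.loads(cache.read_text())["block_size_bytes"])
    block_size = int(measure(device_path))
    cache.write_text(json.dumps({"block_size_bytes": block_size}) + "\n")
    return block_size


# Usage, assuming auto_fio is installed:
#   import auto_fio
#   bs = load_or_measure("/data/train", "block_size.json", auto_fio.optimal_block_size)
```

Since the optimal value is hardware-dependent, a cached file like this should be kept per machine rather than shared across different hardware.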

License

MIT — see LICENSE.
