
Plain-file persistence with explicit guarantees: atomic writes, cooperative locks, checksums, runtime environment inspection.


safeatomic


Plain-file persistence for Python with explicit, composable, runtime-inspectable guarantees.

The problem

# What this looks like:
config_path.write_text(json.dumps(state))

# What can actually happen:
# - process crashes after truncate, before write completes -> empty file
# - power loss after write returns -> data not yet on disk
# - two processes write concurrently -> interleaved result
# - cosmic ray / bad sector -> silent byte drift on read

Path.write_text() is a single syscall sequence. It is not a persistence protocol. For configuration, state, checkpoints, and any file you would be sad to lose, the application has to handle four separate concerns: atomic visibility, crash durability, cooperative writer exclusion, and integrity detection.

Hand-rolling that protocol at every call site is how production code ends up with truncated config files and corrupted state.

The solution

from safeatomic import write_atomic, read_atomic, atomic_yaml_dump

# Atomic write: a reader observes either the old or new content, never partial.
# Cooperative writer lock and parent-directory fsync are on by default.
write_atomic("config.json", '{"key": "value"}')

# Atomic read with integrity check: raises ChecksumMismatchError if the
# on-disk file diverges from its sidecar.
data = read_atomic("config.json", check_checksum=True)

# Format helpers compose atomic write with JSON / YAML / TOML.
atomic_yaml_dump("settings.yaml", {"theme": "dark"})

safeatomic packages the full atomic-write protocol — temp file, fsync, os.replace, parent-directory fsync, cooperative lock, optional checksum sidecar — behind one API. Each guarantee is opt-in, composable per call, and reported at runtime against your actual filesystem.
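For orientation, the steps named above (temp file, fsync, os.replace, parent-directory fsync) can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not safeatomic's implementation: the function name sketch_write_atomic and the ".tmp-" prefix are hypothetical, and locking and checksums are left out.

```python
import os
import tempfile


def sketch_write_atomic(path: str, text: str, encoding: str = "utf-8") -> None:
    """Illustrative only: publish new content so readers see old or new, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # 1. Create the temp file in the target directory so the later
    #    os.replace stays within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding=encoding) as tmp:
            tmp.write(text)
            tmp.flush()
            # 2. Push the file contents to stable storage before publishing.
            os.fsync(tmp.fileno())
        # 3. Atomically swap the finished file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the replace.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # 4. fsync the parent directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Note that tempfile.mkstemp creates the file with mode 0o600, so a real implementation would also have to decide how to handle permissions and ownership of the replaced file.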

What it is, what it is not

Scope: one plain file at a time, on a local POSIX filesystem
Sits between: Path.write_text() and SQLite / DuckDB / LMDB
Not: a database, a query engine, a distributed lock, a WAL, an append log
Targets: Linux + ext4/xfs/btrfs/tmpfs, macOS + APFS
NonTarget: Windows, NFS, SMB, object stores

See Alternatives for when to use safeatomic versus Path.write_text(), lock libraries, SQLite, DuckDB, LMDB/RocksDB, or JSONL.

The four guarantees

| # | Guarantee | What it answers |
|---|-----------|-----------------|
| 1 | AtomicVisibility | "Will a concurrent reader ever see a half-written file?" |
| 2 | CrashDurability | "If the process or machine dies after my write returned, will the data survive?" |
| 3 | WriterExclusion | "Can two writers race and produce a logically interleaved result?" |
| 4 | IntegrityDetection | "Will I notice if the bytes on disk silently differ from what I wrote?" |

Each guarantee is opt-in, with safe defaults. safeatomic is not a lock library, not an fsync wrapper, and not a checksum tool: it is one library in which these four concerns compose behind a single API.

# Atomic visibility only (no lock, no checksum)
write_atomic("cache.json", data, concurrency="none")

# Add cooperative writer exclusion (default)
write_atomic("config.json", data)

# Add integrity detection via sidecar checksum
write_atomic("state.json", data, write_checksum=True)

# Combine all four
write_atomic("critical.json", data, concurrency="lock", write_checksum=True)

CrashDurability is always on for write_atomic (file + parent-directory fsync); opting out would defeat the library's core promise.

Inspect guarantees at runtime

Every guarantee has a documented level per environment (Guaranteed | BestEffort | NonTarget | Unsupported) and is queryable:

from safeatomic import inspect_guarantees

report = inspect_guarantees("/data/state.json")
print(report.environment.filesystem_class)        # "local_posix_persistent"
print(report.guarantees["AtomicVisibility"])      # "Guaranteed"
print(report.guarantees["CrashDurability"])       # "Guaranteed"
print(report.guarantees["WriterExclusion"])       # "Guaranteed"
print(report.guarantees["IntegrityDetection"])    # "Guaranteed"

inspect_guarantees returns the normative view: given the detected filesystem class, which guarantees does the matrix promise? It is cheap enough to call before every operation.

For an empirical view that actually exercises the syscalls — useful at application startup or for diagnostics — use doctor:

from safeatomic import doctor

report = doctor(
    "/data/state.json",
    destructive=True,                                  # run write probes
    require={"AtomicVisibility", "CrashDurability"},   # required guarantees
)
if not report.ok:
    raise RuntimeError(report.summary())

doctor probes the parent directory (existence, writability, exclusive create with 0o600, fsync on file and directory, os.replace, JSON sidecar round-trip, checksum sidecar round-trip). Probe files use the .safeatomic-doctor- prefix and are cleaned up in finally. Without destructive=True, probe-only checks are skipped and reported as unknown — the report still gives you the matrix view.
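To make the probe idea concrete, here is a minimal sketch of one such check (exclusive create with 0o600, fsync, cleanup in finally) written against the standard library. The function name probe_exclusive_create and the random suffix are assumptions for illustration; doctor's actual probes and report format may differ.

```python
import os
import uuid


def probe_exclusive_create(directory: str) -> bool:
    """Illustrative probe: can a 0o600 file be exclusively created and fsynced here?"""
    # Hypothetical naming: the documented prefix plus a random suffix.
    probe_path = os.path.join(directory, f".safeatomic-doctor-{uuid.uuid4().hex}")
    created = False
    try:
        # O_EXCL makes the create fail if the name already exists.
        fd = os.open(probe_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        created = True
        try:
            os.write(fd, b"probe")
            os.fsync(fd)
        finally:
            os.close(fd)
        return True
    except OSError:
        # Missing directory, no write permission, fsync unsupported, and so on.
        return False
    finally:
        # Always remove the probe file, whether the check passed or not.
        if created:
            try:
                os.unlink(probe_path)
            except FileNotFoundError:
                pass
```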

Safety policy

Every operation accepts a safety keyword to control how the library reacts when the environment cannot provide the requested guarantees:

write_atomic(path, data, safety="strict")       # default: raise UnsupportedEnvironmentError
write_atomic(path, data, safety="warn")         # execute, emit UnsupportedEnvironmentWarning
write_atomic(path, data, safety="best_effort")  # execute silently (caller takes responsibility)

move_atomic always refuses cross-device moves (CrossDeviceAtomicityError), regardless of safety — the function name promises atomicity, and silent fallback would break that. If the kernel returns EXDEV only at the final os.replace step (after a successful pre-check), the raw OSError is normalised into CrossDeviceAtomicityError with __cause__ set to the original OSError for diagnostics. See ADR-0008.
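The two-stage behaviour described above (a device pre-check, then normalising a late EXDEV) might look roughly like this. It is a sketch under assumptions: CrossDeviceError is a hypothetical stand-in for CrossDeviceAtomicityError, sketch_move_atomic is not the library function, and the real pre-check may work differently.

```python
import errno
import os


class CrossDeviceError(Exception):
    """Hypothetical stand-in for the library's CrossDeviceAtomicityError."""


def sketch_move_atomic(src: str, dst: str) -> None:
    dst_dir = os.path.dirname(os.path.abspath(dst))
    # Pre-check: a rename can only be atomic within one device.
    if os.stat(src).st_dev != os.stat(dst_dir).st_dev:
        raise CrossDeviceError("source and destination are on different devices")
    try:
        os.replace(src, dst)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            # The kernel disagreed with the pre-check: normalise the error and
            # keep the original OSError reachable via __cause__.
            raise CrossDeviceError("cross-device move refused") from exc
        raise
```

Chaining with "from exc" is what populates __cause__, so callers can still inspect the raw errno when diagnosing the failure.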

Parent-directory fsync after replace

After write_atomic and move_atomic have made the new file visible (os.replace), the library fsyncs the parent directory to confirm the directory-entry change has hit stable storage. If that final fsync fails:

  • safety="strict": the underlying OSError is re-raised. The file is already visible; no rollback is attempted (the replace already committed the new content). The contract is "content may be new, CrashDurability is not confirmed".
  • safety="warn": UnsupportedEnvironmentWarning is emitted and the operation completes normally.
  • safety="best_effort": silent.

See ADR-0011.
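The three reactions to a failed parent-directory fsync can be sketched as a small policy function. Everything here is illustrative: DurabilityWarning stands in for UnsupportedEnvironmentWarning, fsync_parent_dir is a hypothetical name, and the injectable fsync parameter exists only so the failure path can be exercised.

```python
import os
import warnings
from typing import Callable


class DurabilityWarning(UserWarning):
    """Hypothetical stand-in for UnsupportedEnvironmentWarning."""


def fsync_parent_dir(
    path: str,
    safety: str = "strict",
    fsync: Callable[[int], None] = os.fsync,
) -> bool:
    """Return True if the directory fsync succeeded, False if a failure was tolerated."""
    directory = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        fsync(dir_fd)
        return True
    except OSError as exc:
        if safety == "strict":
            # The file is already visible; durability is simply not confirmed.
            raise
        if safety == "warn":
            warnings.warn(f"parent-directory fsync failed: {exc}", DurabilityWarning)
        # "best_effort" falls through silently.
        return False
    finally:
        os.close(dir_fd)
```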

Checksum sidecars

write_atomic(..., write_checksum=True) writes a .sha256 sidecar next to the target. On the read side:

  • verify_checksum(path) — returns True on match, False on genuine digest mismatch, and raises FileNotFoundError when the sidecar is absent.
  • read_atomic(path, check_checksum=True) — returns the payload on match, raises ChecksumMismatchError on genuine digest mismatch, and raises FileNotFoundError when the sidecar is absent.

Absence and mismatch are distinct failure modes and are reported with distinct exception types. See ADR-0009.
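The distinction between an absent sidecar and a mismatching one can be illustrated with hashlib. This is a sketch, not the library's code: the sidecar naming ("state.json" to "state.json.sha256") and its content (a bare hex digest) are assumptions, and the helper names are hypothetical.

```python
import hashlib
from pathlib import Path


def sidecar_path(path: Path) -> Path:
    # Assumed naming convention: append ".sha256" to the full file name.
    return path.with_name(path.name + ".sha256")


def write_sidecar(path: Path) -> None:
    # Assumed format: the hex digest alone. A real implementation would
    # write the sidecar atomically as well; this sketch does not.
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    sidecar_path(path).write_text(digest + "\n", encoding="ascii")


def sketch_verify_checksum(path: Path) -> bool:
    # read_text raises FileNotFoundError when the sidecar is absent,
    # which keeps "no sidecar" distinct from "digest mismatch".
    expected = sidecar_path(path).read_text(encoding="ascii").strip()
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == expected
```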

Symbolic links

v2.0 declares SymlinkPolicy = Unspecified. The behaviour of write_atomic, move_atomic, read_atomic, and the format helpers when target (or any path component) is a symlink is not part of the public contract and may change in a future minor release.

Callers with symlink-sensitive workloads must resolve or reject symlinks themselves before calling into safeatomic — for example with Path.resolve(strict=True) followed by an explicit Path.is_symlink() check on the original argument. See ADR-0010.
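A caller-side guard along those lines could look like the sketch below. The helper name resolve_rejecting_symlinks and the choice of ValueError are illustrative, not part of safeatomic. It only rejects a symlink at the final path component; symlinked parent directories are resolved, not rejected.

```python
from pathlib import Path


def resolve_rejecting_symlinks(target: str) -> Path:
    """Resolve target to an absolute path, refusing a symlink as the final component."""
    original = Path(target)
    # strict=True raises FileNotFoundError if the path does not exist,
    # so this guard only suits targets that are already present.
    resolved = original.resolve(strict=True)
    # resolve() follows links, so the check must use the original argument.
    if original.is_symlink():
        raise ValueError(f"refusing symlink target: {original}")
    return resolved
```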

Supported environments

  • Tier 1 (tested, full guarantees): Linux + ext4/xfs/btrfs/tmpfs; macOS + APFS
  • Tier 2 (expected to work, untested): FreeBSD, OpenBSD, NetBSD
  • Tier 3 (NonTarget): Windows / NTFS / ReFS, NFS, SMB

Under safety="strict" (default), unrecognised or NonTarget filesystems raise UnsupportedEnvironmentError before any I/O happens.

Requirements

  • Python ≥ 3.12
  • POSIX-like operating system

Installation

pip install safeatomic

To enable the ruamel YAML helpers (atomic_yaml_dump_ruamel, atomic_yaml_load_ruamel) for comment-and-order preservation:

pip install safeatomic[ruamel]

API surface

The full public API is 43 names exported from safeatomic. Internal modules are underscore-prefixed and are not part of the public contract.

  • IO core (7): AtomicWriter, AtomicReader, write_atomic, write_atomic_bytes, read_atomic, read_atomic_bytes, move_atomic
  • Locks (9): try_acquire_lock, release_lock, force_release_lock, is_locked, inspect_lock, get_lock_age, is_stale_lock, release_stale_lock, LockInfo
  • Checksum (6): compute_hash_file, compute_hash_data, verify_checksum, write_checksum_file, get_checksum_info, ChecksumInfo
  • Formats (8): atomic_json_dump/atomic_json_load, atomic_yaml_dump/atomic_yaml_load, atomic_yaml_dump_ruamel/atomic_yaml_load_ruamel (require [ruamel] extra), atomic_toml_dump/atomic_toml_load
  • Guarantees (3): inspect_guarantees, GuaranteeReport, Environment
  • Doctor (3): doctor, DoctorReport, DoctorCheck
  • Config (1): safeatomic_config: ContextVar-backed defaults for encoding, checksum_algo, retries, delay. Guarantee-affecting kwargs (safety, concurrency, preserve_metadata, write_checksum) cannot be set via config and must remain explicit at call sites.
  • Exceptions + warnings (6): SafeAtomicError, UnsupportedEnvironmentError, UnsupportedEnvironmentWarning, ChecksumMismatchError, CrossDeviceAtomicityError, LockError

See docs/index.md for the full reference.

Formal protocol models

safeatomic includes small TLA+ models for its abstract core protocol:

  • atomic replacement visibility (SafeAtomicSmoke);
  • cooperative lock lifecycle (SafeAtomicLock);
  • checksum sidecar verification (SafeAtomicChecksum).

These models are checked with TLC under documented assumptions. They do not verify the Python implementation, operating systems, filesystems, serializers, hardware, or deployment environments. os.replace atomicity, fsync durability, and PID semantics are assumptions of the model, not theorems about your machine.

The models, their configurations, the runner script, and the raw TLC output from the canonical run live in formal/ and formal/reports/. A summary is in docs/formal-models.md.

In practice this means a three-layer evidence stack: TLA+ fixes the contract; the test suite exercises the implementation against that contract; doctor() and inspect_guarantees() report the actual capabilities of the specific path you are using.

The formal/ directory is included in the source distribution but excluded from the installed wheel, so pip install safeatomic stays code-only.

What it is not

  • Not a database. No queries, no schema, no multi-record transactions.
  • Not a drop-in replacement for python-atomicwrites. Different API surface, different scope, different guarantees.
  • Not a distributed coordination primitive. Locks are cooperative whole-file locks on a single host.

The scope of the lock model is summarised as:

safeatomic provides cooperative whole-file coordination, not database concurrency control.

For cross-host coordination, use a database or a distributed lock manager (etcd, Consul, Redis). For multi-record transactions, use SQLite.
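To show what "cooperative" means in practice, here is a minimal lock-file sketch using exclusive create. It is an assumption-heavy illustration: the ".lock" suffix, the PID payload, and the function names are hypothetical, and safeatomic's real lock format and stale-lock handling are not shown. The lock only excludes writers that follow the same convention; nothing stops a process that ignores it.

```python
import os


def sketch_try_lock(path: str) -> bool:
    """Try to take a cooperative lock for path; return False if it is already held."""
    lock_path = path + ".lock"  # assumed naming convention
    try:
        # O_EXCL guarantees that only one process can create the lock file.
        fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as lock_file:
        # Record the holder's PID so other tools could inspect the lock.
        lock_file.write(str(os.getpid()))
    return True


def sketch_unlock(path: str) -> None:
    """Release the lock by removing the lock file."""
    os.unlink(path + ".lock")
```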

Versioning

Semantic Versioning with one extension: weakening any documented guarantee is a major version bump, even when no signatures change. See CHANGELOG.md.

Deprecated symbols live for at least one major version cycle before removal.

Logging

The library uses logging.getLogger("safeatomic") for diagnostics. It does not configure a handler; consumers configure logging as they wish.
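An application that wants to see these diagnostics can attach its own handler to the "safeatomic" logger with the standard logging module. The helper below is a sketch: the function name and format string are illustrative choices, and what the library actually logs at each level is not specified here.

```python
import logging


def configure_safeatomic_logging(level: int = logging.INFO) -> logging.Logger:
    """Attach a stderr handler to the "safeatomic" logger and return the logger."""
    logger = logging.getLogger("safeatomic")
    logger.setLevel(level)
    handler = logging.StreamHandler()  # writes to stderr by default
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling configure_safeatomic_logging(logging.DEBUG) once at startup is enough; calling it repeatedly would add duplicate handlers.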

Contributing

See CONTRIBUTING.md.

Security

See SECURITY.md for the vulnerability reporting policy.

License

MIT
