In-process virtual filesystem with hard quota for Python

Maintainers

nightmarewalker

Project description

D-MemFS

An in-process virtual filesystem with hard quota enforcement for Python.

Languages: English | Japanese

Why MFS?

MemoryFileSystem gives you a fully isolated filesystem-like workspace inside a Python process.

Hard quota (MFSQuotaExceededError) to reject oversized writes before OOM
Hierarchical directories and multi-file operations (import_tree, copy_tree, move)
File-level RW locking + global structure lock for thread-safe operations
Free-threaded Python compatible (PYTHON_GIL=0) — stress-tested under 50-thread contention
Async wrapper (AsyncMemoryFileSystem) powered by asyncio.to_thread
Zero runtime dependencies (standard library only)

This is useful when io.BytesIO is too primitive (a single buffer) and OS-level RAM disks or tmpfs are impractical (permissions, container policy, Windows driver friction).
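
The hard-quota idea can be illustrated with the standard library alone. The sketch below is not D-MemFS code: `QuotaBuffer` and `QuotaExceeded` are hypothetical names, and the only point it makes is that the size check happens before any bytes are stored, so a rejected write leaves the buffer unchanged.

```python
import io


class QuotaExceeded(Exception):
    """Hypothetical stand-in for a quota error; not D-MemFS's own class."""


class QuotaBuffer:
    """Toy append-only buffer that refuses writes past a byte budget."""

    def __init__(self, max_bytes: int) -> None:
        self._max_bytes = max_bytes
        self._buf = io.BytesIO()

    def write(self, data: bytes) -> int:
        # Check the budget first, so a rejected write stores nothing.
        if self._buf.tell() + len(data) > self._max_bytes:
            raise QuotaExceeded(
                f"write of {len(data)} bytes would exceed {self._max_bytes}"
            )
        return self._buf.write(data)

    def getvalue(self) -> bytes:
        return self._buf.getvalue()


buf = QuotaBuffer(max_bytes=8)
buf.write(b"hello")  # 5 of 8 bytes used
try:
    buf.write(b"world")  # would need 10 bytes in total
    rejected = False
except QuotaExceeded:
    rejected = True
```

After the second call, `rejected` is true and the buffer still holds only `b"hello"`.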

Installation

pip install D-MemFS

Requirements: Python 3.11+

Quick Start

from dmemfs import MemoryFileSystem, MFSQuotaExceededError

mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)

mfs.mkdir("/data")
with mfs.open("/data/hello.bin", "wb") as f:
    f.write(b"hello")

with mfs.open("/data/hello.bin", "rb") as f:
    print(f.read())  # b"hello"

print(mfs.listdir("/data"))
print(mfs.is_file("/data/hello.bin"))  # True

try:
    with mfs.open("/huge.bin", "wb") as f:
        # 512 MiB is well over the 64 MiB quota, so this write is rejected.
        f.write(bytes(512 * 1024 * 1024))
except MFSQuotaExceededError as e:
    print(e)

API Highlights

`MemoryFileSystem`

open(path, mode, *, preallocate=0, lock_timeout=None)
mkdir, remove, rmtree, rename, move, copy, copy_tree
listdir, exists, is_dir, is_file, walk, glob
stat, stats, get_size
export_as_bytesio, export_tree, iter_export_tree, import_tree

Constructor parameters:

max_quota (default 256 MiB): byte quota for file data
max_nodes (default None): optional cap on total node count (files + directories). Raises MFSNodeLimitExceededError when exceeded.
default_storage (default "auto"): storage backend for new files — "auto" / "sequential" / "random_access"
promotion_hard_limit (default None): byte threshold above which Sequential→RandomAccess auto-promotion is suppressed (None uses the built-in 512 MiB limit)
chunk_overhead_override (default None): override the per-chunk overhead estimate used for quota accounting

Note: The BytesIO returned by export_as_bytesio() is outside quota management. Exporting large files may consume significant process memory beyond the configured quota limit.

Supported binary modes: rb, wb, ab, r+b, xb

`MemoryFileHandle`

read, write, seek, tell, truncate, flush, close
file-like capability checks: readable, writable, seekable

flush() is intentionally a no-op (compatibility API for file-like integrations).

`stat()` return (`MFSStatResult`)

size, created_at, modified_at, generation, is_dir

Supports both files and directories
For directories: size=0, generation=0, is_dir=True

Text Mode

D-MemFS natively operates in binary mode. For text I/O, use MFSTextHandle:

from dmemfs import MemoryFileSystem, MFSTextHandle

mfs = MemoryFileSystem()
mfs.mkdir("/data")

# Write text
with mfs.open("/data/hello.bin", "wb") as f:
    th = MFSTextHandle(f, encoding="utf-8")
    th.write("こんにちは世界\n")
    th.write("Hello, World!\n")

# Read text line by line
with mfs.open("/data/hello.bin", "rb") as f:
    th = MFSTextHandle(f, encoding="utf-8")
    for line in th:
        print(line, end="")

MFSTextHandle is a thin, bufferless wrapper. It encodes on write() and decodes on read() / readline(). Unlike io.TextIOWrapper, it introduces no buffering issues when used with MemoryFileHandle.
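
To make "thin, bufferless wrapper" concrete, here is a minimal sketch of the pattern over a plain io.BytesIO. `ThinTextWrapper` is a hypothetical class written for this illustration; it is not the MFSTextHandle implementation and assumes only that the underlying object offers `write()` and `readline()`.

```python
import io
from typing import BinaryIO, Iterator


class ThinTextWrapper:
    """Hypothetical bufferless text adapter over a binary file object."""

    def __init__(self, raw: BinaryIO, encoding: str = "utf-8") -> None:
        self._raw = raw
        self._encoding = encoding

    def write(self, text: str) -> int:
        # Encode and pass straight through; nothing is held back in a buffer.
        self._raw.write(text.encode(self._encoding))
        return len(text)

    def readline(self) -> str:
        # Decode one binary line at a time. This is safe for UTF-8, where the
        # newline byte never appears inside a multi-byte character.
        return self._raw.readline().decode(self._encoding)

    def __iter__(self) -> Iterator[str]:
        # An empty string from readline() signals end of file.
        while line := self.readline():
            yield line


raw = io.BytesIO()
ThinTextWrapper(raw).write("こんにちは世界\nHello, World!\n")
raw.seek(0)
lines = list(ThinTextWrapper(raw))
```

Because the wrapper keeps no state of its own, the position of the binary handle is always the true position, which is the property the paragraph above contrasts with io.TextIOWrapper.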

Use Case Tutorials

ETL Staging

Stage data through raw → processed → output directories:

from dmemfs import MemoryFileSystem

mfs = MemoryFileSystem(max_quota=16 * 1024 * 1024)
mfs.mkdir("/raw")
mfs.mkdir("/processed")

raw_data = b"id,name,value\n1,foo,100\n2,bar,200\n"
with mfs.open("/raw/data.csv", "wb") as f:
    f.write(raw_data)

with mfs.open("/raw/data.csv", "rb") as f:
    data = f.read()

with mfs.open("/processed/data.csv", "wb") as f:
    f.write(data.upper())

mfs.rmtree("/raw")  # cleanup staging

Archive-like Operations

Store, list, and export multiple files as a tree:

from dmemfs import MemoryFileSystem

mfs = MemoryFileSystem()
mfs.import_tree({
    "/archive/doc1.bin": b"Document 1",
    "/archive/doc2.bin": b"Document 2",
    "/archive/sub/doc3.bin": b"Document 3",
})

print(mfs.listdir("/archive"))  # ['doc1.bin', 'doc2.bin', 'sub']

snapshot = mfs.export_tree(prefix="/archive")  # dict of {path: bytes}

SQLite Snapshot

Serialize an in-memory SQLite DB into MFS and restore it later:

import sqlite3
from dmemfs import MemoryFileSystem

mfs = MemoryFileSystem()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'hello')")
conn.commit()

with mfs.open("/snapshot.db", "wb") as f:
    f.write(conn.serialize())
conn.close()

with mfs.open("/snapshot.db", "rb") as f:
    raw = f.read()
restored = sqlite3.connect(":memory:")
restored.deserialize(raw)
rows = restored.execute("SELECT * FROM t").fetchall()  # [(1, 'hello')]

Concurrency and Locking Notes

Path/tree operations are guarded by _global_lock.
File access is guarded by per-file ReadWriteLock.
lock_timeout behavior:
- None: block indefinitely
- 0.0: try-lock (fail immediately with BlockingIOError)
- > 0: timeout in seconds, then BlockingIOError
Current ReadWriteLock is non-fair: under sustained read load, writers can starve.

Operational guidance:

Keep lock hold duration short
Set an explicit lock_timeout in latency-sensitive code paths
walk() and glob() provide weak consistency: each directory level is snapshotted under _global_lock, but the overall traversal is NOT atomic. Concurrent structural changes may produce inconsistent results.
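
The weak-consistency caveat is easier to see in a toy model. The sketch below assumes a hypothetical in-memory tree (a dict mapping each directory path to its child directory names) and a single lock; it is not how D-MemFS stores nodes. Each level is copied under the lock, but the lock is released between levels, so a concurrent removal can land in the middle of a traversal.

```python
import threading
from typing import Iterator

# Hypothetical toy tree: directory path -> names of child directories.
tree = {"/": ["a", "b"], "/a": [], "/b": []}
lock = threading.Lock()


def walk_weak(root: str) -> Iterator[tuple[str, list[str]]]:
    """Yield (path, children), snapshotting one directory at a time."""
    stack = [root]
    while stack:
        path = stack.pop()
        with lock:
            if path not in tree:
                continue  # removed by someone else since it was listed
            children = list(tree[path])  # per-level snapshot
        yield path, children  # the lock is NOT held here
        for name in reversed(children):
            stack.append(path.rstrip("/") + "/" + name)


walker = walk_weak("/")
first = next(walker)  # the root still lists both children

# A structural change between two levels of the traversal:
with lock:
    del tree["/b"]
    tree["/"].remove("b")

rest = list(walker)
```

Here `first` reports `b` as a child of the root, yet `rest` never visits `/b`: every individual snapshot was valid, but the traversal as a whole does not describe any single state of the tree.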

Async Usage

import asyncio

from dmemfs import AsyncMemoryFileSystem

async def run() -> None:
    mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
    await mfs.mkdir("/a")
    async with await mfs.open("/a/f.bin", "wb") as f:
        await f.write(b"data")
    async with await mfs.open("/a/f.bin", "rb") as f:
        print(await f.read())

asyncio.run(run())
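
The feature list says the async wrapper is powered by asyncio.to_thread. A minimal sketch of that general pattern follows; `AsyncBytesStore` is a hypothetical class over io.BytesIO and does not mirror the AsyncMemoryFileSystem internals.

```python
import asyncio
import io


class AsyncBytesStore:
    """Hypothetical async facade over a blocking in-memory buffer."""

    def __init__(self) -> None:
        self._buf = io.BytesIO()

    async def write(self, data: bytes) -> int:
        # Run the blocking call in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._buf.write, data)

    async def read_all(self) -> bytes:
        return await asyncio.to_thread(self._buf.getvalue)


async def demo() -> bytes:
    store = AsyncBytesStore()
    await store.write(b"da")
    await store.write(b"ta")
    return await store.read_all()


result = asyncio.run(demo())
```

The trade-off of this design is one thread hop per call, which favors fewer, larger operations over many tiny ones.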

Benchmarks

Minimal benchmark tooling is included:

MFS vs io.BytesIO vs PyFilesystem2 (MemoryFS) vs tempfile
Cases: many-small-files and stream write/read
Optional report output to benchmarks/results/

Note: As of setuptools 82 (February 2026), pyfilesystem2 fails to import due to a known upstream issue (#597). Benchmark results including PyFilesystem2 were measured with setuptools ≤ 81 and are valid as historical comparison data.

Run:

uvx --with-requirements requirements.txt --with-editable . python benchmarks/compare_backends.py --save-md auto --save-json auto

See BENCHMARK.md for details.

Latest benchmark snapshot:

benchmark_current_result.md

Testing and Coverage

Test execution and dev flow are documented in TESTING.md.

Typical local run:

uv pip compile requirements.in -o requirements.txt
uvx --with-requirements requirements.txt --with-editable . pytest tests/ -v --timeout=30 --cov=dmemfs --cov-report=xml --cov-report=term-missing

CI (.github/workflows/test.yml) runs tests with coverage XML generation.

API Docs Generation

API docs can be generated as Markdown (viewable on GitHub) using pydoc-markdown:

uvx --with pydoc-markdown --with-editable . pydoc-markdown '{
  loaders: [{type: python, search_path: [.]}],
  processors: [{type: filter, expression: "default()"}],
  renderer: {type: markdown, filename: docs/api_md/index.md}
}'

Or as HTML using pdoc (local browsing only):

uvx --with-requirements requirements.txt pdoc dmemfs -o docs/api

API Reference (Markdown)

Compatibility and Non-Goals

Core open() is binary-only (rb, wb, ab, r+b, xb). Text I/O is available via the MFSTextHandle wrapper.
No symlink/hardlink support — intentionally omitted to eliminate path traversal loops and structural complexity (same rationale as pathlib.PurePath).
No direct pathlib.Path / os.PathLike API — MFS paths are virtual and must not be confused with host filesystem paths. Accepting os.PathLike would allow third-party libraries or a plain open() call to silently treat an MFS virtual path as a real OS path, potentially issuing unintended syscalls against the host filesystem. All paths must be plain str with POSIX-style absolute notation (e.g. "/data/file.txt").
No kernel filesystem integration (intentionally in-process only)
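
The "plain str, POSIX-style absolute" rule can be expressed as a small guard. This is a sketch only: `validate_virtual_path` is a hypothetical helper, and the choice of TypeError for non-str input is an assumption made here (the exception table in this document lists ValueError for invalid paths but does not say what a non-str argument raises).

```python
from pathlib import PurePosixPath


def validate_virtual_path(path: object) -> str:
    """Accept only plain str paths that start with '/'."""
    # Reject os.PathLike objects and anything else that is not a plain str.
    if not isinstance(path, str):
        raise TypeError(f"path must be str, got {type(path).__name__}")
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute: {path!r}")
    return path


results = []
for candidate in ("/data/file.txt", "data/file.txt", PurePosixPath("/data")):
    try:
        validate_virtual_path(candidate)
        results.append("ok")
    except (TypeError, ValueError) as exc:
        results.append(type(exc).__name__)
```

Refusing path objects outright keeps a virtual path from being handed, unnoticed, to code that would open it on the host filesystem.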

Auto-promotion behavior:

By default (default_storage="auto"), new files start as SequentialMemoryFile and auto-promote to RandomAccessMemoryFile when random writes are detected.
Promotion is one-way (no downgrade back to sequential).
Use default_storage="sequential" or "random_access" to fix the backend at construction; use promotion_hard_limit to suppress auto-promotion above a byte threshold.
Storage promotion temporarily doubles memory usage for the promoted file. The quota system accounts for this, but process-level memory may spike briefly.
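
The promotion behavior, including the brief doubling of memory, can be sketched with a toy buffer. `AutoPromotingBuffer` is hypothetical and much simpler than the real SequentialMemoryFile and RandomAccessMemoryFile classes; it assumes "random write" means any write that does not start at the current end of the data.

```python
class AutoPromotingBuffer:
    """Toy buffer: a chunk list for appends, a bytearray after promotion."""

    def __init__(self) -> None:
        self._chunks: list[bytes] = []  # sequential representation
        self._flat: bytearray | None = None  # random-access representation
        self._size = 0

    @property
    def promoted(self) -> bool:
        return self._flat is not None

    def write_at(self, offset: int, data: bytes) -> None:
        if offset < 0:
            raise ValueError("offset must be non-negative")
        if self._flat is None and offset == self._size:
            # Pure append: stay in the cheap sequential form.
            self._chunks.append(bytes(data))
            self._size += len(data)
            return
        if self._flat is None:
            # Promote. The chunks and the joined copy coexist until the chunk
            # list is dropped, which is the temporary doubling noted above.
            self._flat = bytearray(b"".join(self._chunks))
            self._chunks = []
        if offset > len(self._flat):
            # Zero-fill any gap between the current end and the write offset.
            self._flat.extend(bytes(offset - len(self._flat)))
        self._flat[offset:offset + len(data)] = data
        self._size = len(self._flat)

    def getvalue(self) -> bytes:
        if self._flat is not None:
            return bytes(self._flat)
        return b"".join(self._chunks)


buf = AutoPromotingBuffer()
buf.write_at(0, b"hello")
buf.write_at(5, b" world")  # still an append
before = buf.promoted
buf.write_at(0, b"J")  # overwrite at the start triggers promotion
after = buf.promoted
```

Once promoted, the toy buffer never returns to the chunk list, matching the one-way rule stated above.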

Security note: In-memory data may be written to physical disk via OS swap or core dumps. MFS does not provide memory-locking (e.g., mlock) or secure erasure. Do not rely on MFS alone for sensitive data isolation.

Exception Reference

Exception	Typical cause
`MFSQuotaExceededError`	write/import/copy would exceed quota
`MFSNodeLimitExceededError`	node count would exceed `max_nodes` (subclass of `MFSQuotaExceededError`)
`FileNotFoundError`	path missing
`FileExistsError`	creation target already exists
`IsADirectoryError`	file operation on directory
`NotADirectoryError`	directory operation on file
`BlockingIOError`	lock timeout or open-file conflict
`io.UnsupportedOperation`	mode mismatch / unsupported operation
`ValueError`	invalid mode/path/seek/truncate arguments
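
Because MFSNodeLimitExceededError is a subclass of MFSQuotaExceededError, the order of except clauses matters. The sketch below uses hypothetical stand-in classes with the same parent-child relationship so that it runs without the library installed; only the relationship itself comes from the table.

```python
class QuotaError(Exception):
    """Stand-in for MFSQuotaExceededError (hypothetical, for illustration)."""


class NodeLimitError(QuotaError):
    """Stand-in for MFSNodeLimitExceededError, a subclass per the table."""


def classify(exc: Exception) -> str:
    """Re-raise `exc` and report which handler catches it."""
    try:
        raise exc
    except NodeLimitError:
        # The subclass must come first, or the parent clause would swallow it.
        return "node limit"
    except QuotaError:
        return "byte quota"
    except (FileNotFoundError, FileExistsError):
        return "path state"
    except BlockingIOError:
        return "lock contention"


labels = [
    classify(NodeLimitError()),
    classify(QuotaError()),
    classify(FileNotFoundError()),
    classify(BlockingIOError()),
]
```

Code that only cares whether some limit was hit can catch the parent class alone and will see both cases.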

Testing with pytest

D-MemFS ships a pytest plugin that provides an mfs fixture:

# conftest.py — register the plugin explicitly
pytest_plugins = ["dmemfs._pytest_plugin"]

Note: The plugin is not auto-discovered. Users must declare it in conftest.py to opt in.

# test_example.py
def test_write_read(mfs):
    mfs.mkdir("/tmp")
    with mfs.open("/tmp/hello.txt", "wb") as f:
        f.write(b"hello")
    with mfs.open("/tmp/hello.txt", "rb") as f:
        assert f.read() == b"hello"

Development Notes

Design documents (Japanese):

Architecture Spec v13 — API design, internal structure, CI matrix
Detailed Design Spec — component-level design and rationale
Test Design Spec — test case table and pseudocode

These documents are written in Japanese and serve as internal design references.

Performance Summary

Key results from the included benchmark (300 small files × 4 KiB, 16 MiB stream, 2 GiB large stream):

Case	MFS (ms)	BytesIO (ms)	tempfile (ms)
small_files_rw	34	5	164
stream_write_read	64	51	17
random_access_rw	24	53	27
large_stream_write_read	1 438	7 594	1 931
many_files_random_read	777	163	4 745

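MFS is roughly 5 to 7 times slower than BytesIO on the small-file cases (34 ms vs 5 ms, 777 ms vs 163 ms) and slightly slower on the 16 MiB stream (64 ms vs 51 ms), but about 2 times faster on random access (24 ms vs 53 ms) and about 5 times faster on the large stream (1 438 ms vs 7 594 ms). See BENCHMARK.md and benchmark_current_result.md for full data.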

Note: tempfile results above were measured with the system temp directory on a RAM disk. On a physical SSD/HDD, tempfile performance will be substantially slower.
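
The relative speeds can be read off the table with a few lines of arithmetic. The snippet below only restates the MFS and BytesIO columns from the table above and computes BytesIO time divided by MFS time, so a ratio above 1 means MFS was faster in that case.

```python
# (MFS ms, BytesIO ms) per case, copied from the table above.
timings_ms = {
    "small_files_rw": (34, 5),
    "stream_write_read": (64, 51),
    "random_access_rw": (24, 53),
    "large_stream_write_read": (1438, 7594),
    "many_files_random_read": (777, 163),
}

ratios = {
    case: round(bytesio / mfs, 2) for case, (mfs, bytesio) in timings_ms.items()
}

for case, ratio in ratios.items():
    verdict = "MFS faster" if ratio > 1 else "BytesIO faster"
    print(f"{case:26s} BytesIO/MFS = {ratio:5.2f}  ({verdict})")
```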

License

MIT License

Release history

Version	Release date
0.4.1	Mar 22, 2026
0.4.0	Mar 21, 2026
0.3.0	Mar 8, 2026
0.2.1	Mar 1, 2026
0.2.0 (this version)	Feb 28, 2026

Download files

Source Distribution

d_memfs-0.2.0.tar.gz (21.6 kB), uploaded Feb 28, 2026

Built Distribution

d_memfs-0.2.0-py3-none-any.whl (24.5 kB), uploaded Feb 28, 2026, Python 3

File details

Details for the file d_memfs-0.2.0.tar.gz.

File metadata

Download URL: d_memfs-0.2.0.tar.gz
Upload date: Feb 28, 2026
Size: 21.6 kB
Tags: Source
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for d_memfs-0.2.0.tar.gz
Algorithm	Hash digest
SHA256	`6cbe5c384c74eff1258ac80f3da4bd65df5a6841f8a0c52f834088e4dc2d176b`
MD5	`077e48e0d0caadb81882e5420d143c29`
BLAKE2b-256	`6f7249a973f5a348fd586b3153385e62f5153c0a3b10b818c7b3c5e3cee39b19`
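
A downloaded file can be checked against the digests above with hashlib. This is a sketch, with `file_digests` and `matches` as hypothetical helper names; it covers SHA256 and BLAKE2b-256 (BLAKE2b with a 32-byte digest) and assumes the file has already been saved locally.

```python
import hashlib
from pathlib import Path


def file_digests(path: Path, chunk_size: int = 1 << 16) -> dict[str, str]:
    """Return hex digests of `path`, reading it in chunks."""
    sha256 = hashlib.sha256()
    blake2b_256 = hashlib.blake2b(digest_size=32)
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            sha256.update(chunk)
            blake2b_256.update(chunk)
    return {
        "SHA256": sha256.hexdigest(),
        "BLAKE2b-256": blake2b_256.hexdigest(),
    }


def matches(path: Path, expected: dict[str, str]) -> bool:
    """True if every expected digest equals the one computed from the file."""
    actual = file_digests(path)
    return all(
        actual[name] == digest.lower() for name, digest in expected.items()
    )
```

For example, `matches(Path("d_memfs-0.2.0.tar.gz"), {"SHA256": "6cbe5c384c74eff1258ac80f3da4bd65df5a6841f8a0c52f834088e4dc2d176b"})` should return True for an unmodified copy of the source distribution.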

Provenance

The following attestation bundles were made for d_memfs-0.2.0.tar.gz:

Publisher: publish.yml on nightmarewalker/D-MemFS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: d_memfs-0.2.0.tar.gz
- Subject digest: 6cbe5c384c74eff1258ac80f3da4bd65df5a6841f8a0c52f834088e4dc2d176b
- Sigstore transparency entry: 1005572600
- Sigstore integration time: Feb 28, 2026
Source repository:
- Permalink: nightmarewalker/D-MemFS@f28a8fc0bb4d7f5d13558a22b46af79583e65ee7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/nightmarewalker
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f28a8fc0bb4d7f5d13558a22b46af79583e65ee7
- Trigger Event: release

File details

Details for the file d_memfs-0.2.0-py3-none-any.whl.

File metadata

Download URL: d_memfs-0.2.0-py3-none-any.whl
Upload date: Feb 28, 2026
Size: 24.5 kB
Tags: Python 3
Uploaded using Trusted Publishing? Yes
Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for d_memfs-0.2.0-py3-none-any.whl
Algorithm	Hash digest
SHA256	`58fc90d8372fcacc972178a2f493ed4cdedcfba83f6de3ec08eeaec481c0b9d0`
MD5	`800b8052d2b476947ebf04b195396511`
BLAKE2b-256	`9dbd9c328da0d3d176450cf3a9c07d7559988aa8f6107fbf32c086ac9cd532e4`

Provenance

The following attestation bundles were made for d_memfs-0.2.0-py3-none-any.whl:

Publisher: publish.yml on nightmarewalker/D-MemFS

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

Statement:
- Statement type: https://in-toto.io/Statement/v1
- Predicate type: https://docs.pypi.org/attestations/publish/v1
- Subject name: d_memfs-0.2.0-py3-none-any.whl
- Subject digest: 58fc90d8372fcacc972178a2f493ed4cdedcfba83f6de3ec08eeaec481c0b9d0
- Sigstore transparency entry: 1005572601
- Sigstore integration time: Feb 28, 2026
Source repository:
- Permalink: nightmarewalker/D-MemFS@f28a8fc0bb4d7f5d13558a22b46af79583e65ee7
- Branch / Tag: refs/tags/v0.2.0
- Owner: https://github.com/nightmarewalker
- Access: public
Publication detail:
- Token Issuer: https://token.actions.githubusercontent.com
- Runner Environment: github-hosted
- Publication workflow: publish.yml@f28a8fc0bb4d7f5d13558a22b46af79583e65ee7
- Trigger Event: release
