A filesystem-in-userspace (FUSE) offering reading/writing of arbitrarily large files

Project description

A FUSE file system vending arbitrary volume of reproducible data in a file

This is a FUSE (3) application that implements a file system. The file system presents a single file (with a name of your choice, for convenience) backed by a PRNG, which is deterministic by definition and yields an infinite series of bits. Currently the PRNG is the Splitmix64 algorithm (specifically the next_int variant, under "Basic pseudocode algorithm"). This PRNG was chosen for its fast data generation, its simplicity of implementation, and its sufficiently long full period. Disclaimer: this isn't a cryptographic application -- the PRNG is not meant to be used as a CSPRNG.
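For reference, the next_int step can be sketched in a few lines of Python. This is an illustration rather than the project's C code: the class name SplitMix64 is made up for the example, and the seed is an assumption, since the seed the file system uses is not stated here.

```python
MASK64 = (1 << 64) - 1  # Python ints are unbounded, so wrap to 64 bits by hand
GOLDEN_GAMMA = 0x9E3779B97F4A7C15  # fixed increment added to the state on every step


class SplitMix64:
    """Sequential Splitmix64 generator (the next_int step)."""

    def __init__(self, seed: int = 0) -> None:
        # Assumption: seed 0. The real file system may seed differently.
        self.state = seed & MASK64

    def next_int(self) -> int:
        """Advance the state and return the next 64-bit output."""
        self.state = (self.state + GOLDEN_GAMMA) & MASK64
        z = self.state
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        return z ^ (z >> 31)
```

Two generators created with the same seed yield the same outputs in the same order, which is the reproducibility the file relies on.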

You can think of the purpose of this project as making available a file much like the well-known /dev/random, except that a) the file is meant to function as a so-called regular file, seekable and allowing random access, rather than as a device file, which is what /dev/random is, and b) the file produces the same data for the same read offset every time, as if it were a real, storage-backed file containing that specific series of bits. Again: the PRNG does not meet the randomness criteria required of e.g. a CSPRNG. For the purposes of the file system it matters little how "random" the data are; the only important property of their distribution is that it minimises the chance that an application under verification, which relies on the file system, writes wrong data that nevertheless happen to match the series at the target offset.
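Random access is cheap with this kind of generator because the Splitmix64 state advances by a fixed constant on every step, so the n-th output can be computed directly instead of by iterating. The sketch below shows the idea. How the project actually maps byte offsets to generator outputs is not described above, so the seed, the little-endian word layout, and the names word_at and bytes_at are all assumptions.

```python
import struct

MASK64 = (1 << 64) - 1
GOLDEN_GAMMA = 0x9E3779B97F4A7C15


def word_at(index: int, seed: int = 0) -> int:
    """Return the index-th (0-based) 64-bit Splitmix64 output without iterating."""
    # After index + 1 steps the state equals seed + (index + 1) * gamma (mod 2**64).
    z = (seed + (index + 1) * GOLDEN_GAMMA) & MASK64
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
    return z ^ (z >> 31)


def bytes_at(offset: int, length: int, seed: int = 0) -> bytes:
    """Return `length` bytes of the series starting at byte `offset`."""
    if length <= 0:
        return b""
    first = offset // 8                 # first 64-bit word touched by the read
    last = (offset + length - 1) // 8   # last 64-bit word touched by the read
    # Assumption: each word is laid out little-endian in the file.
    chunk = b"".join(struct.pack("<Q", word_at(i, seed)) for i in range(first, last + 1))
    start = offset - first * 8
    return chunk[start:start + length]
```

With this layout, bytes_at(13, 20) returns the same bytes as slicing bytes_at(0, 64)[13:33], regardless of how far into the series the offset lies.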

Crucially, the file system allows writing to the file it makes available, with the important restriction that the data written must match the corresponding part of the series. In effect, what is written at an offset must be identical to what would be read from the same offset. This makes the file system useful in data verification scenarios, which is what prompted me to implement it in the first place.

As for reading and writing, mounting the file system lets you specify the initial size of the file, which effectively caps the series at some offset, at least as initially presented. The file system does, however, permit writing (appending) data past the end of the file -- again, iff the data being written match what the corresponding portion of the file would contain as defined by the series. Conversely, reading past the end of the file will of course not produce any data (despite the series being infinite in general). Both of these properties follow how a regular file is expected to behave. The --size mount option (0 by default, i.e. when omitted) lets you vary the kind of scenarios the file system may be used in.
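The read, write, and size rules described above can be modelled in a few lines. The following is a toy model, not the project's implementation: the MockFile class and the series helper are hypothetical, the seed and byte layout are assumed, and the treatment of a write that starts beyond the current end of the file is a guess, since the text only covers appending.

```python
import errno
import struct

MASK64 = (1 << 64) - 1


def series(offset: int, length: int) -> bytes:
    """Bytes of a Splitmix64 series (seed 0, little-endian words; both assumed)."""
    out = bytearray()
    for i in range(offset // 8, (offset + length + 7) // 8):
        z = ((i + 1) * 0x9E3779B97F4A7C15) & MASK64
        z = ((z ^ (z >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
        z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK64
        out += struct.pack("<Q", z ^ (z >> 31))
    start = offset % 8
    return bytes(out[start:start + length])


class MockFile:
    """Toy model of the single file: a size plus the write-must-match rule."""

    def __init__(self, size: int = 0) -> None:
        self.size = size  # mirrors the --size mount option (0 by default)

    def read(self, offset: int, length: int) -> bytes:
        # Like a regular file, nothing is returned at or past the end.
        length = max(0, min(length, self.size - offset))
        return series(offset, length)

    def write(self, offset: int, data: bytes) -> int:
        if data != series(offset, len(data)):
            raise OSError(errno.EIO, "data does not match the series")
        # A matching write may extend the file (assumed also for offsets past the end).
        self.size = max(self.size, offset + len(data))
        return len(data)
```

A MockFile(size=16) returns at most 16 bytes from read(0, 100), accepts write(16, series(16, 8)) and grows to 24 bytes, and raises OSError for any write whose data differ from the series.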

Usage

Building

Build the program as per convention, using e.g. GNU Make:

make

This will produce ./mock-large-files-fuse.

Mounting the file system

Mount the file system, as per convention, using ./mock-large-files-fuse and a mountpoint of your choice (a path to an existing directory):

./mock-large-files-fuse /mnt/mock-large-files-fuse --filename data

Reading

The file system will "shadow" the path and make available a file named data directly in the mountpoint directory. By default the file is empty; pass --size with a value on the ./mock-large-files-fuse command line shown above to get a file that can actually be read. The following reads the first 100 bytes in the series (or as many as are available, if the file is smaller) and prints them in hexadecimal format:

xxd -l 100 /mnt/mock-large-files-fuse/data

Writing

As explained earlier, writing to the file will fail unless the bytes you write are the same as the data that would be read at that offset. The following will therefore produce an I/O error:

echo 'Hello world' > /mnt/mock-large-files-fuse/data

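The variant below, however, will succeed, because it writes back exactly what it read from the same offset. It uses dd with conv=notrunc rather than the shell's > redirection, which opens the target for truncation before it has been read:

dd if=/mnt/mock-large-files-fuse/data of=/mnt/mock-large-files-fuse/data bs=100 count=1 conv=notrunc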
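The same read-then-write-back round trip can be done from a program. The sketch below uses only the standard os module; the function name rewrite_in_place is hypothetical, and the expectation that a mismatching write surfaces as an OSError is based on the I/O error mentioned above rather than on the project's code.

```python
import os


def rewrite_in_place(path: str, offset: int, length: int) -> int:
    """Read up to `length` bytes at `offset` and write the same bytes back there.

    Returns the number of bytes written (0 if nothing could be read).
    """
    # O_RDWR without O_TRUNC, so the file is not emptied before it is read.
    fd = os.open(path, os.O_RDWR)
    try:
        data = os.pread(fd, length, offset)
        # On the mock file system a mismatch is expected to raise OSError.
        return os.pwrite(fd, data, offset) if data else 0
    finally:
        os.close(fd)
```

For example, rewrite_in_place("/mnt/mock-large-files-fuse/data", 0, 100) mirrors the shell example, provided the file was mounted with a --size of at least 100.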

Installing with Python

Although this is a [relatively simple] C application through and through, which naturally does not require Python, I am planning to offer at least an equivalent written in Python, and I thought that a presence on PyPI as a package would help with distribution of the software. Building the "wheel" has the same requirements as building the program (a C compiler, a linker, and Make). So, normally, does installing the program from PyPI. Installation can be done conventionally:

  1. From PyPI:
pip install mock-large-files-fuse
  2. From a GitHub repository:
pip install git+ssh://git@github.example.com/owner/mock-large-files-fuse.git

Performance

Clocked 3-4 GiB/s reading from a 10 TB file on Linux 5.14 (5.14.0-611.16.1.el9_7.x86_64; SMP; PREEMPT) on an Intel Core i7-6700. Disclaimer: yes, I am well aware this is nowhere near enough detail to make this a true benchmark report, but frankly I have no idea which parts of the machinery are a factor here: the power policy configured for the kernel, the Linux distribution (RHEL 9), the version of libfuse, not to mention a plethora of other perfectly valid candidates. I just want to give you a taste of the order of magnitude of the performance, that is all.
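To get a comparable rough number on your own machine, a sequential read can be timed with a short script. This is only a sketch: how the figure above was measured is not stated, the function name read_throughput and the 1 MiB block size are arbitrary choices, and Python's own overhead may well lower the result.

```python
import os
import time


def read_throughput(path: str, total_bytes: int, block_size: int = 1 << 20) -> float:
    """Read up to `total_bytes` sequentially; return the rate in bytes per second."""
    done = 0
    fd = os.open(path, os.O_RDONLY)
    start = time.perf_counter()
    try:
        while done < total_bytes:
            chunk = os.read(fd, min(block_size, total_bytes - done))
            if not chunk:  # end of file reached before total_bytes
                break
            done += len(chunk)
    finally:
        os.close(fd)
    elapsed = time.perf_counter() - start
    return done / elapsed if elapsed > 0 else 0.0
```

Dividing the result of read_throughput("/mnt/mock-large-files-fuse/data", 10 * 2**30) by 2**30 gives the rate in GiB/s, assuming the file was mounted with a --size of at least 10 GiB.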


Download files

Source Distribution

mock_large_files_fuse-1.0.0.post4.tar.gz (9.2 kB)

Uploaded Source

Built Distributions

mock_large_files_fuse-1.0.0.post4-py3-none-musllinux_1_2_x86_64.whl (10.8 kB)

Uploaded Python 3, musllinux: musl 1.2+ x86-64

mock_large_files_fuse-1.0.0.post4-py3-none-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl (10.5 kB)

Uploaded Python 3, manylinux: glibc 2.28+ x86-64, manylinux: glibc 2.5+ x86-64

File details

Details for the file mock_large_files_fuse-1.0.0.post4.tar.gz.

File hashes

Hashes for mock_large_files_fuse-1.0.0.post4.tar.gz
Algorithm Hash digest
SHA256 c97a33af752ec90049f0edcdde14004b56387d07d74f5ace2e4803bfe765aba8
MD5 bdab7816951c7c67a5195ca843918057
BLAKE2b-256 621f3a9070f87f2f0d167a3604043ab37797b6d153f27a9f64d5a949df2c8884

Provenance

The following attestation bundles were made for mock_large_files_fuse-1.0.0.post4.tar.gz:

Publisher: release-and-publish.yaml on unioslo/mock-large-files-fuse

Attestations: Values shown here reflect the state when the release was signed and may no longer be current.

File details

Details for the file mock_large_files_fuse-1.0.0.post4-py3-none-musllinux_1_2_x86_64.whl.

File hashes

Hashes for mock_large_files_fuse-1.0.0.post4-py3-none-musllinux_1_2_x86_64.whl
Algorithm Hash digest
SHA256 073be785b3611991f7349b8ae165258a7bb8e64a998791b3d1320e9b5b1de19e
MD5 fea9a3d272748e5815df311f19735f73
BLAKE2b-256 b9577f09e3b9f4746b9505bf92f7d34ce7b38bababb5f2aff3f0ac5868b90a01

Provenance

The following attestation bundles were made for mock_large_files_fuse-1.0.0.post4-py3-none-musllinux_1_2_x86_64.whl:

Publisher: release-and-publish.yaml on unioslo/mock-large-files-fuse

File details

Details for the file mock_large_files_fuse-1.0.0.post4-py3-none-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl.

File hashes

Hashes for mock_large_files_fuse-1.0.0.post4-py3-none-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl
Algorithm Hash digest
SHA256 6dfb4830f73569b724e3f21e1c7ed1a8eb0e7aa3865dbc3dfa02bd540b827917
MD5 729bbc4cdc0ae0a1b460c4b7594f0975
BLAKE2b-256 a4a99b380c92a8f6114bd4bd178814108bda65665307e06689032227ba52ec23

Provenance

The following attestation bundles were made for mock_large_files_fuse-1.0.0.post4-py3-none-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl:

Publisher: release-and-publish.yaml on unioslo/mock-large-files-fuse
