A filesystem-in-userspace (FUSE) offering reading/writing of arbitrarily large files

Reason this release was yanked:

Package likely broken due to bundling `libfuse3` which breaks dynamic linking of installed program

Project description

A FUSE file system vending arbitrary volume of reproducible data in a file

This is a FUSE (3) application offering a file system. The file system presents a single file (with a name of your choice, for convenience) backed by a PRNG, which is deterministic by definition and vends data from an infinite series of bits. Currently the PRNG is the Splitmix64 algorithm (specifically the next_int variant, under "Basic pseudocode algorithm"). This PRNG was chosen for its fast data generation, simplicity of implementation, and sufficiently long full period. Disclaimer: this isn't a cryptographic application, and the PRNG is not meant to be used as a CSPRNG.
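For reference, the publicly documented Splitmix64 step can be sketched in C as below. This is an illustration of the algorithm only, not code taken from this project: the `splitmix64` struct and the `splitmix64_next` name are hypothetical, and the seed the file system actually uses is not stated here.

```c
#include <stdint.h>

/* Splitmix64 state: a single 64-bit counter (hypothetical type name). */
typedef struct {
    uint64_t state;
} splitmix64;

/* Advance the counter by the fixed odd increment, then scramble the
 * new counter value with two xor-shift-multiply rounds and a final
 * xor-shift. Unsigned arithmetic wraps modulo 2^64 by definition. */
static uint64_t splitmix64_next(splitmix64 *g)
{
    uint64_t z = (g->state += UINT64_C(0x9E3779B97F4A7C15));
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}
```

Two generators initialised with the same state always produce the same sequence, which is the property the file system relies on.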

You can think of the purpose of this project as making available a file much like the well-known /dev/random, except that a) the file is meant to function as a so-called regular file, seekable and allowing random access, rather than a device file, which is what /dev/random is categorised as, and b) the file produces the same data for the same reading offset, every time, as if it were a real, storage-backed file containing that specific series of bits. Again: the PRNG does not meet the randomness criteria required of e.g. a CSPRNG. For the purposes of the file system it is of little consequence how "random" the data are; the only important property of their distribution is that it minimises the chance that an application being verified against the file system writes wrong data which nonetheless happens to match the series at that offset.
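One way a Splitmix64-backed file can be made seekable is to exploit the fact that the generator's state is a plain counter: the k-th output depends only on seed + (k + 1) * increment, so any offset can be computed directly without generating the preceding data. The sketch below illustrates that idea under assumptions that are mine, not the project's: the function names are hypothetical, each 64-bit output is assumed to be laid out little-endian, and the seed is a parameter.

```c
#include <stddef.h>
#include <stdint.h>

/* Splitmix64 output scrambler applied to a counter value. */
static uint64_t mix64(uint64_t z)
{
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* k-th output (0-based) of Splitmix64 started from `seed`, computed
 * in constant time instead of by iterating k times. */
static uint64_t splitmix64_at(uint64_t seed, uint64_t k)
{
    return mix64(seed + (k + 1) * UINT64_C(0x9E3779B97F4A7C15));
}

/* Fill buf with len bytes of the series starting at byte `offset`.
 * ASSUMPTION: byte i of the file is byte (i % 8) of output (i / 8),
 * least significant byte first. Recomputing the word for every byte
 * is wasteful but keeps the sketch short. */
static void series_read(uint64_t seed, uint64_t offset,
                        uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        uint64_t pos = offset + i;
        uint64_t word = splitmix64_at(seed, pos / 8);
        buf[i] = (uint8_t)(word >> (8 * (pos % 8)));
    }
}
```

Reading 8 bytes at offset 13 this way yields exactly bytes 13 to 20 of a read that started at offset 0, which is what "the same data for the same reading offset" requires.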

Crucially, the file system allows writing to the file it makes available, with one important restriction: the data written must match the corresponding part of the series. In effect, what is written at an offset must be identical to what would be read from the same offset. This makes the file system useful in data verification scenarios, which is what prompted me to implement it in the first place.
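The write rule amounts to comparing each incoming byte with the byte the series defines for that offset. A minimal sketch, assuming a little-endian byte layout and a caller-supplied seed (both assumptions), with hypothetical names `series_byte` and `check_write`; returning a negative errno value mirrors the convention of libfuse's high-level handlers, and EIO is chosen because the usage section describes the failure as an I/O error:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Byte of the Splitmix64 series at absolute position `pos`.
 * ASSUMPTION: 64-bit outputs are laid out least significant byte first. */
static uint8_t series_byte(uint64_t seed, uint64_t pos)
{
    uint64_t z = seed + (pos / 8 + 1) * UINT64_C(0x9E3779B97F4A7C15);
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    z ^= z >> 31;
    return (uint8_t)(z >> (8 * (pos % 8)));
}

/* Return 0 when buf matches the series at `offset`, -EIO on the
 * first mismatching byte. Nothing is stored: a matching write
 * carries no information the series does not already define. */
static int check_write(uint64_t seed, uint64_t offset,
                       const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != series_byte(seed, offset + i))
            return -EIO;
    }
    return 0;
}
```

An application under verification that corrupts, reorders, or misplaces even one byte would have its write rejected.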

Mounting the file system lets you specify the initial size of the file, which effectively caps the series at that offset, at least as initially presented. The file system does permit writing (appending) data past the end of the file, again only if the data being written match what the corresponding portion of the file would contain as defined by the series. Conversely, reading past the end of the file will of course not produce any data (despite the series being infinite in general). Both of these properties follow how a regular file is expected to behave. The --size mounting option (0 by default, i.e. when omitted) lets you vary the kind of scenario the file system is used in.
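The size semantics described above can be modelled with a size field that clamps reads and grows on accepted writes. The sketch below is a model under stated assumptions, not the project's implementation: `mock_file`, `mock_read`, and `mock_write` are hypothetical names, the byte layout is assumed little-endian, and a write that starts beyond the current end is assumed to extend the file to the end of the written range.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>

/* Hypothetical model of the single file: a seed plus a current size. */
typedef struct {
    uint64_t seed;
    uint64_t size;
} mock_file;

/* Series byte at `pos` (ASSUMPTION: little-endian word layout). */
static uint8_t series_byte(uint64_t seed, uint64_t pos)
{
    uint64_t z = seed + (pos / 8 + 1) * UINT64_C(0x9E3779B97F4A7C15);
    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    z ^= z >> 31;
    return (uint8_t)(z >> (8 * (pos % 8)));
}

/* Reads stop at the current size, like on a regular file. */
static size_t mock_read(const mock_file *f, uint64_t offset,
                        uint8_t *buf, size_t len)
{
    if (offset >= f->size)
        return 0;
    if (len > f->size - offset)
        len = (size_t)(f->size - offset);
    for (size_t i = 0; i < len; i++)
        buf[i] = series_byte(f->seed, offset + i);
    return len;
}

/* Writes are accepted only when they match the series; an accepted
 * write that reaches past the end grows the file. */
static ssize_t mock_write(mock_file *f, uint64_t offset,
                          const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != series_byte(f->seed, offset + i))
            return -EIO;
    }
    if (offset + len > f->size)
        f->size = offset + len;
    return (ssize_t)len;
}
```

In this model a file mounted with size 0 starts out unreadable and becomes readable only as far as correct data has been appended, while a rejected write leaves the size unchanged.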

Usage

Building

Build the program as per convention, using e.g. GNU Make:

make

This will produce ./mock-large-files-fuse.

Mounting the file system

Mount the filesystem, as per convention, using ./mock-large-files-fuse and a mountpoint of your choice (a path to an existing directory):

./mock-large-files-fuse /mnt/mock-large-files-fuse --filename data

Reading

The file system will "shadow" the mountpoint path and make available a file named data directly in the mountpoint directory. By default the file is empty; add --size with a value to the ./mock-large-files-fuse command line above to get a file with readable content. The following reads the first 100 bytes in the series (or as many as the file holds, if fewer) and prints them in hexadecimal format:

xxd -l 100 /mnt/mock-large-files-fuse/data

Writing

As explained earlier, writing to the file will fail unless the bytes you write are the same as the data that would be read at that offset. The following will therefore produce an I/O error:

echo 'Hello world' > /mnt/mock-large-files-fuse/data

The below variant, however, will succeed:

head -c 100 /mnt/mock-large-files-fuse/data > /mnt/mock-large-files-fuse/data
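Note that with shell redirection the shell opens the target with O_TRUNC before head starts reading, and how the file system treats that truncation is not described here. A program that wants to read data and write it back at the same offset without involving truncation can open the file once with O_RDWR and use pread and pwrite. The following is a sketch only: `read_back_write` is a hypothetical helper, it handles at most 4096 bytes, and it does not retry short reads or writes.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to `len` bytes (capped at 4096) from the start of `path`
 * and write the same bytes back at offset 0. The file is opened
 * without O_TRUNC, so its existing content and size are untouched
 * before the read. Returns the number of bytes written back, 0 if
 * the file is empty, or -1 on error (errno is set by the failing call). */
static ssize_t read_back_write(const char *path, size_t len)
{
    uint8_t buf[4096];
    if (len > sizeof buf)
        len = sizeof buf;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    ssize_t n = pread(fd, buf, len, 0);
    ssize_t w = n;
    if (n > 0)
        w = pwrite(fd, buf, (size_t)n, 0);

    close(fd);
    return w;
}
```

Called as read_back_write("/mnt/mock-large-files-fuse/data", 100), this would be expected to succeed on the mounted file, since the bytes written are exactly the bytes just read from the same offset.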

Installing with Python

Although this is a [relatively simple] C application project through and through, which naturally does not require Python, I am planning to at least offer an equivalent written in Python, so I thought that a presence on PyPI as a package would help with distribution of the software. Building the "wheel" poses the same requirements as building the program (a C compiler, a linker, and Make). So, normally, does installing the program from PyPI. Installation can be done conventionally:

  1. From PyPI:
pip install mock-large-files-fuse
  2. From a GitHub repository:
pip install git+ssh://git@github.example.com/owner/mock-large-files-fuse.git

Performance

Clocked 3-4 GiB/s reading from a 10 TB file on Linux 5.14 (5.14.0-611.16.1.el9_7.x86_64; SMP; PREEMPT) on an Intel Core i7-6700. Disclaimer: yes, I am well aware this is nowhere near enough detail to make this a true benchmark report, but frankly I have no idea which parts of the machinery are a factor here: the power policy configured for the kernel, the Linux distribution (RHEL 9), the version of libfuse, or any of a plethora of other perfectly valid candidates. I just want to give you a sense of the order of magnitude of the performance, that is all.

Download files

Source Distributions

No source distribution files available for this release.

Built Distribution

mock_large_files_fuse-1.0.0.post1-py3-none-manylinux_2_28_x86_64.whl (91.5 kB view details)

Uploaded Python 3, manylinux: glibc 2.28+ x86-64

File details

Details for the file mock_large_files_fuse-1.0.0.post1-py3-none-manylinux_2_28_x86_64.whl.

File hashes

Hashes for mock_large_files_fuse-1.0.0.post1-py3-none-manylinux_2_28_x86_64.whl
Algorithm Hash digest
SHA256 251c6ecf8474e70fce3278cb629a81b6e43a286d6e4aff67fa367ef44fe13cac
MD5 86e3bba1c38f72873a1072993a47bb2f
BLAKE2b-256 986a4fa7e24c44ecc9db842d96d121742dec5eee27a8c1de04262faefd9d94f8

Provenance

The following attestation bundles were made for mock_large_files_fuse-1.0.0.post1-py3-none-manylinux_2_28_x86_64.whl:

Publisher: release-and-publish.yaml on unioslo/mock-large-files-fuse
