Save Python objects in binary form (cloudpickle + lz4) to the file system or to a virtual disk in RAM, and manage them through one interface.

Files3 - Python Object File System

  1. English Version
    1. Overview
    2. Installation
    3. Quick Start
    4. Use Cases
    5. Advanced Usage
    6. In-Memory Backend
    7. Embedded Packaging
    8. CLI Commands
    9. Notice

Overview

A Windows-native Python object persistence library. It saves almost any Python object to the file system behind a dict-like interface, using cloudpickle for serialization and lz4 for compression.

When to Use

Scenario | Why Files3
Config persistence | Store complex Python objects (custom classes, lambdas, closures) instead of JSON/YAML
Local cache | Cache function results or intermediate states without a database
Data exchange | Pass Python objects between processes/scripts via the file system
Embedded packaging | Pack resources into a .py file with packpy for distribution
Experiment snapshots | Quickly save model weights, parameters, states mid-experiment

Core Advantages

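  • Any object storage: cloudpickle handles lambdas, closures, local classes, and module references, which standard pickle cannot serialize
  • Fast compression: lz4 enabled by default, balancing space and speed
  • Source code relinking: when a class or function defined in __main__ is saved, its source file path is recorded and the module name is corrected on load, so it still loads after the script is renamed or moved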
  • Dict-like API: f['key'], f.key, f.set() all work
  • Sub-key support: one primary key can expand into multiple sub-keys, auto-converting to a folder
  • Dual backend: file system (F3Shell) or shared memory (F3Mem) with identical APIs
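
To make the first point concrete, here is a small sketch that uses only the standard library. It shows plain pickle rejecting a lambda, which is the gap cloudpickle is meant to fill. Nothing here calls files3, and the names `scale` and `can_pickle` are made up for illustration.

```python
import pickle

# A module-level lambda. Standard pickle stores functions by qualified name,
# and the name "<lambda>" cannot be looked up again at load time.
scale = lambda x: x * 2

def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False

print(can_pickle([1, 2, 3]))  # plain data structures are fine
print(can_pickle(scale))      # the lambda is rejected by standard pickle
```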

Not Recommended For

  • Cross-platform data exchange (Windows only)
  • High-concurrency write scenarios (no locking, relies on filesystem atomicity)
  • Massive key-value stores (hundreds of thousands+ keys, filesystem inode bottleneck)
  • Complex querying requiring SQL-like search

Installation

pip install files3

After installation, you can associate a file extension with the f3open viewer from cmd:

f3assoc .ist

Quick Start

from files3 import files

f = files('./data')  # workspace directory, default suffix '.ist'

# Save
f.set('model', {'weights': [0.1, 0.2], 'epoch': 10})

# Load
print(f.get('model'))  # {'weights': [0.1, 0.2], 'epoch': 10}

# Check
print(f.has('model'))  # True

# Delete
f.delete('model')

Use Cases

1. Config Persistence

from files3 import files

class MyModel:  # placeholder; substitute your own class
    pass

f = files('./config')

# Save a complex config with custom classes
f['app_cfg'] = {
    'lr_scheduler': lambda epoch: 0.1 ** (epoch // 10),  # lambda is fine
    'model_cls': MyModel,  # class reference is fine
    'layers': [64, 128, 256],
}

# Load it later (even in another script)
cfg = f['app_cfg']

2. Function Result Cache

from files3 import files

f = files('./cache')

def expensive_compute(x):
    key = f'compute_{x}'
    if f.has(key):
        return f[key]
    result = sum(i ** 2 for i in range(x))
    f[key] = result
    return result
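
The same check-then-store pattern can be wrapped in a decorator. The sketch below is hypothetical and is not part of files3: `cached_in` and `sum_squares` are illustrative names, and a plain dict stands in for the store. It assumes only the dict-style operations described in this document (`key in store`, `store[key]`, `store[key] = value`).

```python
import functools

def cached_in(store):
    """Cache a function's results in any dict-like store."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # Build a flat string key such as "sum_squares_10".
            key = "_".join([func.__name__, *map(str, args)])
            if key in store:
                return store[key]
            result = func(*args)
            store[key] = result
            return result
        return wrapper
    return decorator

store = {}  # stand-in; a files('./cache') object is assumed to behave the same way

@cached_in(store)
def sum_squares(n):
    return sum(i ** 2 for i in range(n))

print(sum_squares(10))  # computed, then stored under "sum_squares_10"
print(sum_squares(10))  # served from the store
```

Because each key becomes a file name on disk, arguments should be limited to values whose string form is a valid file name.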

3. Data Exchange Between Processes

# script_a.py
from files3 import files

trained_model = {'weights': [0.1, 0.2]}  # placeholder for your trained model
f = files('./shared')
f['model'] = trained_model

# script_b.py
from files3 import files

f = files('./shared')
model = f['model']

4. Batch Operations with Filter Syntax

import re
from files3 import files

f = files('./data')

# Delete all keys starting with 'temp_'
del f[re.compile(r'^temp_')]

# Set multiple keys at once
f['a', 'b', 'c'] = 100

# Delete by custom filter
del f[lambda name, ftype: name.startswith('old_')]

# Clear everything
del f[...]
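
One way to picture how these selectors pick keys is the sketch below. It is an illustration, not the files3 implementation: `select_keys` is a hypothetical helper, and passing the suffix '.ist' as the second filter argument is an assumption about what `ftype` means.

```python
import re

def select_keys(names, selector):
    """Return the names matched by a string, regex, callable, or Ellipsis."""
    if selector is Ellipsis:
        return list(names)
    if isinstance(selector, str):
        return [n for n in names if n == selector]
    if isinstance(selector, re.Pattern):
        return [n for n in names if selector.search(n)]
    if callable(selector):
        # The filter receives (name, ftype); ftype is assumed to be the suffix.
        return [n for n in names if selector(n, ".ist")]
    raise TypeError(f"unsupported selector: {selector!r}")

names = ["temp_a", "temp_b", "old_cfg", "model"]
print(select_keys(names, re.compile(r"^temp_")))
print(select_keys(names, lambda name, ftype: name.startswith("old_")))
print(select_keys(names, ...))
```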

Advanced Usage

Dict-Style Access

from files3 import files

f = files('./data')

# Write
f.a = 1          # same as f.set('a', 1, error=True)
f['b'] = 2       # same as above
f['c', 'data'] = [1, 2, 3]  # sub-key

# Read
print(f.a)       # same as f.get('a', error=False)
print(f['b'])    # same as f.get('b', error=True)

# Delete
del f.a
del f['b']

# Check
'a' in f         # same as f.has('a', error=True)
len(f)           # count of primary keys

Sub-Keys

One primary key can hold multiple sub-keys. The primary key automatically becomes a folder.

from files3 import files

f = files('./data')
f.set('user', {'name': 'alice'})           # user.ist (file)
f.set('user', {'age': 30}, skey='age')      # user.ist/ (folder)
                                            #   _.ist   original content
                                            #   age.ist new content

f['user', '_']     # read default sub-key
f['user', 'age']   # read age sub-key
f.list('user')     # ['_', 'age']
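
The file-to-folder promotion described above can be sketched with the standard library. This is a guess at the on-disk behaviour based on the layout in the comments, not the library's code. `set_key` and `list_skeys` are hypothetical helpers, and plain pickle stands in for cloudpickle + lz4.

```python
import pickle
import tempfile
from pathlib import Path

SUFFIX = ".ist"
DEFAULT_SKEY = "_"

def set_key(root, key, value, skey=None):
    """Store value under key; the first sub-key write turns the file into a folder."""
    path = Path(root) / (key + SUFFIX)
    if skey is None and not path.is_dir():
        path.write_bytes(pickle.dumps(value))
        return
    if path.is_file():
        # Promote the existing file to a folder, keeping its content as "_.ist".
        original = path.read_bytes()
        path.unlink()
        path.mkdir()
        (path / (DEFAULT_SKEY + SUFFIX)).write_bytes(original)
    path.mkdir(exist_ok=True)
    (path / ((skey or DEFAULT_SKEY) + SUFFIX)).write_bytes(pickle.dumps(value))

def list_skeys(root, key):
    """Return the sorted sub-key names stored under key."""
    folder = Path(root) / (key + SUFFIX)
    return sorted(p.stem for p in folder.glob("*" + SUFFIX))

root = tempfile.mkdtemp()
set_key(root, "user", {"name": "alice"})        # user.ist is a file
set_key(root, "user", {"age": 30}, skey="age")  # user.ist becomes a folder
print(list_skeys(root, "user"))
```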

Serialization Tools

from files3 import files

# Serialize to bytes
b = files.dumps({'data': [1, 2, 3]})
obj = files.loads(b)

# Pack a file/directory into bytes
b = files.pack(r'C:\my_resource')
files.unpack(b, r'C:\extract_to')

# Pack into a Python module (no files3 dependency to unpack)
code = files.packpy(r'C:\my_resource')
with open('resource.py', 'w') as fh:
    fh.write(code)

# Unpack from the module
from resource import F3DATA
files.unpackpy(F3DATA, r'C:\extract_to')
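
For readers who want to see the serialize-then-compress idea behind dumps/loads without installing anything, here is a stand-in sketch. It swaps in the standard library's pickle and zlib for cloudpickle and lz4, so the bytes it produces are not compatible with files3. The function names merely mirror the ones above.

```python
import pickle
import zlib

def dumps(obj):
    """Serialize obj, then compress the resulting bytes."""
    return zlib.compress(pickle.dumps(obj))

def loads(data):
    """Decompress the bytes, then rebuild the object."""
    return pickle.loads(zlib.decompress(data))

payload = {"data": [1, 2, 3] * 500}  # repetitive data compresses well
blob = dumps(payload)
print(len(pickle.dumps(payload)), "->", len(blob), "bytes")
print(loads(blob) == payload)
```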

In-Memory Backend (F3Mem)

F3Mem keeps objects in OS shared memory: there is no disk IO, other processes can share the data directly, and everything is lost on reboot. The API is identical to files.

from files3 import memfiles

m = memfiles('my_ns')
m['key'] = {'speed': 'fast'}
m['key', 'sub'] = 'zero_disk_io'

# Persist to disk
m.save('./backup')

# Load from disk
m2 = memfiles('another_ns')
m2.load('./backup')

# Cleanup
m.clear()

Feature | files (F3Shell) | memfiles (F3Mem)
Backend | File system | OS shared memory
Persistent | Yes | No (lost on reboot)
Cross-process | Via filesystem | Direct share (zero-copy)
Disk IO | Yes | None
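
As a rough picture of what a shared-memory backend involves, the sketch below stores one pickled object in a named block using Python's multiprocessing.shared_memory. It says nothing about how F3Mem is actually built. `put` and `get` are hypothetical helpers, and unlike the zero-copy sharing listed in the table, this version copies the bytes when reading.

```python
import pickle
import struct
from multiprocessing import shared_memory

HEADER = struct.Struct("<Q")  # 8-byte little-endian payload length

def put(obj):
    """Pickle obj into a new shared-memory block and return the block."""
    payload = pickle.dumps(obj)
    shm = shared_memory.SharedMemory(create=True, size=HEADER.size + len(payload))
    # The OS may round the block size up, so record the real payload length.
    HEADER.pack_into(shm.buf, 0, len(payload))
    shm.buf[HEADER.size:HEADER.size + len(payload)] = payload
    return shm

def get(name):
    """Attach to an existing block by name and unpickle its content."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        (length,) = HEADER.unpack_from(shm.buf, 0)
        return pickle.loads(bytes(shm.buf[HEADER.size:HEADER.size + length]))
    finally:
        shm.close()

block = put({"speed": "fast"})
value = get(block.name)  # another process could attach using the same name
print(value)
block.close()
block.unlink()  # release the block; its content is gone afterwards
```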

Embedded Packaging

Pack any file or directory into a .py file so it can travel with your source code. No database or extra dependency needed on the receiving end.

Pack to a .py file

from files3 import prefab

# Pack C:\my_data (file or folder) into C:\my_data.py
prefab.aspy(r'C:\my_data')

What happens:

  1. Zip-compresses the target.
  2. Converts the zip bytes into a Python bytes literal.
  3. Writes the literal into C:\my_data.py as variable F3DATA.
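
The three steps can be sketched with the standard library alone. This is an approximation for illustration, not the code behind prefab.aspy: `pack_to_source` and `unpack_from_source` are hypothetical names, and only the variable name F3DATA is taken from the description above.

```python
import io
import tempfile
import zipfile
from pathlib import Path

def pack_to_source(target):
    """Zip a file or folder and return Python source that defines F3DATA."""
    target = Path(target)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        if target.is_file():
            zf.write(target, target.name)
        else:
            for path in sorted(target.rglob("*")):
                if path.is_file():
                    zf.write(path, path.relative_to(target).as_posix())
    # repr() of a bytes object is a valid Python bytes literal.
    return f"F3DATA = {buffer.getvalue()!r}\n"

def unpack_from_source(source, dest):
    """Execute the generated source and extract the embedded zip into dest."""
    namespace = {}
    exec(source, namespace)  # only run source you generated yourself
    with zipfile.ZipFile(io.BytesIO(namespace["F3DATA"])) as zf:
        zf.extractall(dest)

src_dir = Path(tempfile.mkdtemp())
(src_dir / "hello.txt").write_text("hi")
code = pack_to_source(src_dir)

out_dir = Path(tempfile.mkdtemp())
unpack_from_source(code, out_dir)
print((out_dir / "hello.txt").read_text())
```

This sketch skips empty directories, and the generated literal takes more space than the raw zip bytes, so it suits small resources best.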

Auto-extract on first run

Use astarget in your script. If the target is missing, it extracts from the adjacent .py file automatically.

from files3 import prefab

# If C:\my_data does not exist, extract from C:\my_data.py
prefab.astarget(r'C:\my_data')

Low-level API

If you need the code string in memory instead of a file:

from files3 import files

# Returns a Python code string (contains F3DATA variable)
code = files.packpy(r'C:\my_data')

# Later, feed the string back to unpack
files.unpackpy(code, r'C:\extract_to')

CLI Commands

f3 [name] [type] -d [dir]   # open a files3 object interactively
f3open [filepath]             # open a single .ist file
f3assoc [type]                # associate file extension with f3open
f3unassoc [type]              # remove file association

Notice

  • Security: pickle is not safe. Do not call loads() on data from untrusted sources; loading a pickle can execute arbitrary code.
  • Cannot save: F3Bool instances, F3Shell instances, active exception objects, generators, open file handles, and some C-extension objects.
  • Windows only: Relies on Win32 APIs for file associations and folder icons.
  • ModuleNotFoundError on load: If the source script was moved/renamed, use f.relink(new_path, 'key') to fix.
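
The security warning is worth seeing once in code. The sketch below uses only standard pickle and a deliberately harmless expression. `Payload` is a made-up class, and the same mechanism applies to any pickle-based format.

```python
import pickle

class Payload:
    """An object whose pickle form asks the loader to call eval()."""
    def __reduce__(self):
        # On load, pickle calls eval("6 * 7") instead of rebuilding a Payload.
        return (eval, ("6 * 7",))

blob = pickle.dumps(Payload())
result = pickle.loads(blob)  # runs the embedded expression
print(result)  # 42
```

An attacker would replace the arithmetic with something destructive, so only load data you produced yourself or received from a trusted source.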

Download files

Source Distribution

files3-0.10.0.tar.gz (69.2 kB)

Built Distribution

files3-0.10.0-py3-none-any.whl (72.6 kB)
