Organise all your data in key/value booklets and sync them with S3
EBooklet
EBooklet is a Python key-value database that syncs with S3 (AWS or any S3-compatible service). It builds on the Booklet package, providing a MutableMapping (dict-like) interface backed by local files and remote S3 storage.
- S3 sync: push/pull changes between a local database and an S3 bucket
- Dict-like API: standard MutableMapping plus dbm-style methods
- Grouped storage: hash keys into N groups stored as single S3 objects, with automatic byte-range reads
- Concurrency: thread-safe writes (thread locks), multiprocessing-safe (file locks), and S3 object locking for remote writes
Keys must be strings (S3 object name requirement). Values can use any serializer supported by Booklet.
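To make the dict-like contract concrete, here is a minimal in-memory sketch of a MutableMapping that accepts only string keys and serializes values with pickle. It is not EBooklet's implementation; the class name StrKeyStore is hypothetical and nothing here touches files or S3.

```python
import pickle
from collections.abc import MutableMapping


class StrKeyStore(MutableMapping):
    """Toy in-memory store: string keys, pickled values (illustration only)."""

    def __init__(self):
        self._data = {}  # key (str) -> serialized value (bytes)

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError(f"keys must be str, got {type(key).__name__}")
        self._data[key] = pickle.dumps(value)

    def __getitem__(self, key):
        # A missing key raises KeyError, as with a normal dict.
        return pickle.loads(self._data[key])

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


store = StrKeyStore()
store['key1'] = ['one', 2, 'three', 4]
store.update({'key2': {'nested': True}})  # update() is inherited from MutableMapping
```

Implementing the five methods above is enough for MutableMapping to supply the rest of the dict interface (update, pop, items, and so on), which is why a dict-like database can be used in most places a dict would be.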
Installation
pip install ebooklet
Booklet vs EBooklet
Booklet is a single-file key/value database used as the foundation for EBooklet. Booklet manages local data, while EBooklet manages the interaction between local and remote data. It is best to familiarize yourself with Booklet before using EBooklet.
EBooklet is designed so you can primarily work with Booklet locally, then push to S3 later via EBooklet. If you're actively collaborating with others, open the data using EBooklet to prevent conflicts.
Unlike Booklet, which relies on fast thread locks and OS-level file locks, EBooklet uses S3 object locking when opened for writing. This ensures that only one process has write access to a remote database at a time, but it is slower than local file locks.
Quick Start
Connection setup
Create an S3Connection with your credentials and bucket info:
import ebooklet

remote_conn = ebooklet.S3Connection(
    access_key_id='my_key_id',
    access_key='my_secret_key',
    db_key='big_data.blt',
    bucket='my-bucket',
    endpoint_url='https://s3.us-west-001.backblazeb2.com',  # optional, for non-AWS
    db_url='https://my-bucket.org/big_data.blt',  # optional, public URL
)
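Hard-coding credentials as in the snippet above is fine for a quick test, but in practice they are usually read from the environment. The sketch below shows one way to do that; the environment variable names and both helper functions are hypothetical, and only the S3Connection parameters already shown above are used.

```python
import os


def s3_connection_kwargs(env=None):
    """Build S3Connection keyword arguments from environment variables.

    The variable names are hypothetical; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    kwargs = {
        'access_key_id': env['EBOOKLET_ACCESS_KEY_ID'],
        'access_key': env['EBOOKLET_ACCESS_KEY'],
        'db_key': env['EBOOKLET_DB_KEY'],
        'bucket': env['EBOOKLET_BUCKET'],
    }
    # endpoint_url is optional and only needed for non-AWS services.
    endpoint_url = env.get('EBOOKLET_ENDPOINT_URL')
    if endpoint_url:
        kwargs['endpoint_url'] = endpoint_url
    return kwargs


def connect_from_env():
    # Imported lazily so the helper above can be used without the package installed.
    import ebooklet
    return ebooklet.S3Connection(**s3_connection_kwargs())
```

A missing required variable raises KeyError immediately, which is usually preferable to discovering a bad credential on the first S3 request.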
Read-only shortcut
If you only need to read and have a public URL, pass it directly — no S3Connection needed:
db = ebooklet.open('https://my-bucket.org/big_data.blt', '/tmp/big_data.blt', flag='r')
Open, read, write
with ebooklet.open(remote_conn, '/tmp/big_data.blt', flag='c', value_serializer='pickle') as db:
    db['key1'] = ['one', 2, 'three', 4]
    value = db['key1']
Be careful with flags — using 'n' will delete the remote database in addition to the local one.
Grouped Storage
By default, each key/value pair is stored as a separate S3 object. When num_groups is set, keys are hashed into N groups, each stored as a single S3 object containing all key/value pairs for that group.
db = ebooklet.open(remote_conn, '/tmp/big_data.blt', flag='n',
                   value_serializer='pickle', num_groups=64)
- Keys are assigned to groups via blake2b hash mod num_groups
- Single-key reads use S3 byte-range GET requests to fetch only the needed bytes
- Multi-key reads from the same group use a single merged byte-range GET
- On push, entire affected groups are re-packed and uploaded
- For existing databases, num_groups is read from S3 metadata (user-provided value is ignored)
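The key-to-group assignment described in the list above can be sketched with the standard library. This is an illustration only: the 8-byte digest size and the big-endian integer conversion are assumptions, and EBooklet's actual hashing parameters may differ, so the group numbers produced here should not be expected to match the package.

```python
import hashlib


def group_for_key(key: str, num_groups: int) -> int:
    """Map a string key to a group index in range(num_groups)."""
    # ASSUMPTION: 8-byte digest read as a big-endian integer.
    digest = hashlib.blake2b(key.encode('utf-8'), digest_size=8).digest()
    return int.from_bytes(digest, 'big') % num_groups


def group_keys(keys, num_groups):
    """Bucket keys by group index: {group_index: [key, ...]}."""
    groups = {}
    for key in keys:
        groups.setdefault(group_for_key(key, num_groups), []).append(key)
    return groups


groups = group_keys([f'key{i}' for i in range(1000)], num_groups=64)
```

Because the assignment depends on num_groups, changing it would move keys between groups, which is consistent with the value being fixed in S3 metadata once a database exists.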
Use grouped storage when you have many small values — it reduces the number of S3 objects and can improve read performance through byte-range requests.
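As a rough illustration of a merged byte-range read, the sketch below computes one HTTP Range header that covers several keys in the same group and then slices the individual values out of the returned bytes. The index layout (key mapped to offset and length) and all function names are hypothetical; how EBooklet actually lays out and indexes a group object is not described here.

```python
def merged_range(index, keys):
    """Return (start, end), inclusive byte positions covering all requested keys."""
    spans = [index[key] for key in keys]
    start = min(offset for offset, _ in spans)
    end = max(offset + length for offset, length in spans) - 1
    return start, end


def range_header(start, end):
    """Format an HTTP Range header value; both ends are inclusive."""
    return f'bytes={start}-{end}'


def slice_values(index, keys, payload, start):
    """Cut each value out of a payload that begins at byte `start` of the object."""
    values = {}
    for key in keys:
        offset, length = index[key]
        values[key] = payload[offset - start:offset - start + length]
    return values


# Hypothetical group object with four values packed back to back.
blob = b'aaaabbcccccdd'
index = {'a': (0, 4), 'b': (4, 2), 'c': (6, 5), 'd': (11, 2)}

start, end = merged_range(index, ['b', 'd'])
header = range_header(start, end)
payload = blob[start:end + 1]  # stands in for the body of the ranged GET
values = slice_values(index, ['b', 'd'], payload, start)
```

The trade-off in this sketch is that one request also downloads the bytes between the requested values (here the value for 'c'), which is usually cheaper than issuing a separate request per key when values are small.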
Syncing with S3
The changes() method returns a Change object for inspecting and pushing differences between local and remote:
with ebooklet.open(remote_conn, '/tmp/big_data.blt', 'w') as db:
    db['key1'] = 'new value'

    changes = db.changes()
    for change in changes.iter_changes():
        print(change)

    changes.push()  # upload local changes to S3
Use changes.discard() to remove local changes without pushing, or pass specific keys to discard selectively:
changes.discard() # discard all local changes
changes.discard(['key1']) # discard only key1
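The inspect, push, and discard calls can be combined into a small review step. The helper below is a sketch that uses only the methods shown in this section (changes(), iter_changes(), push(), discard()); the function name push_if_approved and the approve callback are hypothetical.

```python
def push_if_approved(db, approve):
    """Push pending changes if approve(pending) is true, otherwise discard them.

    `db` is any object exposing changes() as described above.
    Returns the list of pending changes that were inspected.
    """
    changes = db.changes()
    pending = list(changes.iter_changes())
    if not pending:
        return pending  # nothing to push or discard
    if approve(pending):
        changes.push()
    else:
        changes.discard()
    return pending


# Possible usage inside a writable session:
# with ebooklet.open(remote_conn, '/tmp/big_data.blt', 'w') as db:
#     db['key1'] = 'new value'
#     push_if_approved(db, lambda pending: len(pending) < 100)
```

Note that the rejected branch removes the local changes entirely, so a real workflow may prefer to leave them in place for later review.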
Other Methods
| Method | Description |
|---|---|
| delete_remote() | Delete the entire remote database |
| copy_remote(remote_conn) | Copy the remote to another S3 location. Efficient S3-to-S3 copy when credentials match, otherwise downloads then uploads |
| load_items(keys=None) | Download keys/values to the local file without returning them. Pass None to load everything |
| get_items(keys) | Load then return an iterator of (key, value) pairs |
| map(func, keys=None, n_workers=None) | Apply a function to items in parallel using multiprocessing. func(key, value) should return (new_key, new_value) or None to skip |
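A function passed to map has to follow the contract in the table: take (key, value) and return either a (new_key, new_value) pair or None. The sketch below shows one such function; its name and behaviour are made up for illustration. It is defined at module level because multiprocessing generally needs to pickle the function, which does not work for lambdas or nested functions.

```python
def normalise_text(key, value):
    """Example func for map(): skip non-strings, otherwise strip and lower-case."""
    if not isinstance(value, str):
        return None  # None tells map() to skip this item
    return key, value.strip().lower()


# Possible usage, following the signature in the table above:
# db.map(normalise_text, n_workers=4)
```

What map does with the returned pairs is not covered in this section, so check the package documentation before relying on a particular behaviour.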
Remote Connection Groups
Remote connection groups organize and store collections of S3Connection objects. All data from an S3Connection is stored except the access_key and access_key_id. They are useful for grouping related or versioned databases together.
They work like a normal EBooklet except they use add instead of set, keys are database UUIDs, and values are dicts of S3Connection parameters plus metadata.
The remote connection must already exist to be added to a group.
remote_conn_rcg = ebooklet.S3Connection(
    access_key_id_rcg, access_key_rcg, db_key_rcg, bucket_rcg,
    endpoint_url=endpoint_url_rcg,
)

with ebooklet.open(remote_conn_rcg, '/tmp/rcg.blt', 'n', remote_conn_group=True) as rcg:
    rcg.add(remote_conn)

    changes = rcg.changes()
    changes.push()
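To make the credential rule for groups concrete, here is a minimal sketch of stripping the two secret fields from a set of connection parameters before they are stored. The helper name shareable_params and the plain-dict representation are assumptions for illustration; the actual record layout and metadata fields used by EBooklet are not described here.

```python
# The two fields that a remote connection group does not store.
SECRET_FIELDS = frozenset({'access_key', 'access_key_id'})


def shareable_params(params):
    """Return a copy of params without the secret credential fields."""
    return {name: value for name, value in params.items()
            if name not in SECRET_FIELDS}


params = {
    'access_key_id': 'my_key_id',
    'access_key': 'my_secret_key',
    'db_key': 'big_data.blt',
    'bucket': 'my-bucket',
    'endpoint_url': 'https://s3.us-west-001.backblazeb2.com',
}
public = shareable_params(params)
```

Since the credentials are not stored, anyone opening a database listed in a group presumably has to supply their own credentials or rely on a public URL.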
Open Flags
| Flag | Meaning |
|---|---|
| 'r' | Open existing database for reading only (default) |
| 'w' | Open existing database for reading and writing |
| 'c' | Open database for reading and writing, creating it if it doesn't exist |
| 'n' | Always create a new, empty database, open for reading and writing |
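These flags follow the same convention as Python's standard dbm module, so their local effect can be tried safely without any S3 access. The sketch below uses dbm.dumb from the standard library, not EBooklet, purely to show why 'n' deserves care: it empties an existing database. With EBooklet the same flag also deletes the remote database, as noted earlier.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo')

    # 'c' creates the database if it does not exist.
    with dbm.dumb.open(path, 'c') as db:
        db['key1'] = 'one'

    # 'r' opens the existing database read-only.
    with dbm.dumb.open(path, 'r') as db:
        keys_before = list(db.keys())  # dbm returns keys as bytes

    # 'n' always starts from an empty database, discarding 'key1'.
    with dbm.dumb.open(path, 'n') as db:
        keys_after = list(db.keys())
```

When in doubt, 'c' is the safer choice for scripts that may run against an existing database.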