Full-featured Python dict interface to the LMDB "Lightning" Database.

lmdb-dict-full

The full-featured dict interface to the LMDB "Lightning" Database.

  • Internally optimized via lmdb library cursors. Optional LRU caching of deserialized values. Thread-safe operations. No added reserved keys, etc.

  • Provides value-serializing SafeLmdbDict and str-only StrLmdbDict, as well as abstract base class LmdbDict for customization of database encoding.

  • Unique-key, labeled and unlabeled databases and read-write sessions supported.


Installation

pip install lmdb-dict-full

Use

General use

SafeLmdbDict provides the full dict interface to an LMDB database at a given filesystem path. (If the directory does not yet contain a database, an empty one is provisioned automatically.)

Values are automatically serialized (deserialized) and compressed (decompressed) using PyYAML and zlib.

from lmdb_dict import SafeLmdbDict

dbdict = SafeLmdbDict('/path/to/db/directory/0/')

dbdict['aaa'] = {'values': [0, 1, 'x']}
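
The serialize-then-compress step described above can be sketched as a simple round trip. This is only an illustration, not the library's code: to stay within the standard library it uses json as a stand-in for PyYAML, and the helper names encode_value and decode_value are hypothetical.

```python
import json
import zlib


def encode_value(value):
    """Serialize a value to text, then compress it to bytes for storage."""
    text = json.dumps(value)  # assumption: json stands in for a PyYAML dump
    return zlib.compress(text.encode('utf-8'))


def decode_value(blob):
    """Reverse of encode_value: decompress, then deserialize."""
    text = zlib.decompress(blob).decode('utf-8')
    return json.loads(text)


stored = encode_value({'values': [0, 1, 'x']})
restored = decode_value(stored)
```

The point of the sketch is that every read of a value pays for both decompression and deserialization, which is the cost the caching layer discussed later is meant to avoid.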

One or more named databases are also supported.

LMDB requires that the maximum number of named databases be specified up front. Below we need only two named databases.

users = SafeLmdbDict('/path/to/db/directory/1/', name='users', max_dbs=2)

hats = SafeLmdbDict('/path/to/db/directory/1/', name='hats', max_dbs=2)

Note that it would otherwise be unsafe to hold open multiple lmdb client objects within a single process at once. This is handled automatically: a weak reference is kept to the client opened for each filesystem path and reused for each LmdbDict requiring it.
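
The reuse of one client per path can be pictured with a weak-value registry. The following is a minimal sketch of that general pattern, not the library's implementation: FakeClient and get_client are hypothetical names, FakeClient merely stands in for a real lmdb client object, and the thread-safety of the registry itself is not addressed here.

```python
import weakref


class FakeClient:
    """Hypothetical stand-in for an lmdb client opened on a path."""

    def __init__(self, path):
        self.path = path


# Maps path -> client. An entry disappears on its own once nothing else
# holds a strong reference to the client.
_clients = weakref.WeakValueDictionary()


def get_client(path):
    """Return the live client for a path, opening one only if none exists."""
    client = _clients.get(path)
    if client is None:
        client = FakeClient(path)
        _clients[path] = client
    return client


users_client = get_client('/path/to/db/directory/1/')
hats_client = get_client('/path/to/db/directory/1/')
```

Both names end up bound to the same object, so two dictionaries opened on the same directory would share a single client for as long as either of them is alive.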

Caching

Caching of LMDB itself should not be necessary. The database "fully exploits the operating system’s buffer cache" and memory mapping [ref].

Moreover, lmdb-dict-full makes every effort to use lmdb efficiently, such that the user need not be concerned with undue overhead of interacting with the database-backed dictionary.

That said: the value serialization layer of SafeLmdbDict is another matter. Given sufficiently hefty values to deserialize, it may be worthwhile to engage the lmdb-dict-full caching layer, along with the trade-offs that it entails.

Caveats

lmdb-dict-full caching is thread-safe

This is achieved with behind-the-scenes locking, applied narrowly to individual keys where feasible. The small overhead of that locking is incurred whenever caching is enabled.
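
As an illustration of locking that is narrowed to individual keys, here is a minimal sketch of the general technique. It is not taken from the library: KeyLocks, slow_deserialize and cached_get are hypothetical names, and a plain dict stands in for the cache.

```python
import threading


class KeyLocks:
    """Hand out one lock per key so unrelated keys never block each other."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = {}

    def lock_for(self, key):
        with self._guard:
            lock = self._locks.get(key)
            if lock is None:
                lock = self._locks[key] = threading.Lock()
            return lock


cache = {}
key_locks = KeyLocks()
load_count = 0


def slow_deserialize(key):
    """Pretend to do expensive work; count how often it runs."""
    global load_count
    load_count += 1
    return key.upper()


def cached_get(key):
    # Only threads asking for the same key wait on each other.
    with key_locks.lock_for(key):
        if key not in cache:
            cache[key] = slow_deserialize(key)
        return cache[key]


threads = [threading.Thread(target=cached_get, args=('aaa',)) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because all eight threads contend for the same per-key lock, the expensive step runs once and the remaining threads read the cached result.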

lmdb-dict-full caching is not (yet) automatically process-safe

Caching is thread-safe thanks to thread locks and (again) weak references to caches which must be shared across dictionaries backed by the same databases.

Achieving the same under a multiprocessing regime would be another matter.

Users may nonetheless make use of lmdb-dict-full while multiprocessing, either without caching or with thoughtful application of caches across processes.

Options

Caching is built into all concrete subclasses of LmdbDict; however, it is disabled by default, in that it is set to DummyCache – a mapping capable of storing zero items.

Subclasses of LmdbDict check their cache for its maximum capacity by means of: getattr(cache, 'maxsize', …). A cache reporting maxsize=0 – such as the DummyCache – will be given dummy locks, such that locking is disabled for this dictionary.

A cache reporting any other maxsize – or lacking this property – is treated as a proper cache, and locking will be applied.
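
The capacity check can be sketched roughly as follows. This is an assumption-laden illustration rather than the library's code: ZeroCache and choose_lock are hypothetical names, contextlib.nullcontext plays the role of a dummy lock, and the default passed to getattr (None) is a placeholder, since the actual default is not stated above.

```python
import contextlib
import threading


class ZeroCache(dict):
    """Hypothetical stand-in for DummyCache: a mapping that stores nothing."""

    maxsize = 0

    def __setitem__(self, key, value):
        pass  # silently discard every item


def choose_lock(cache):
    """Return a no-op lock for zero-capacity caches, a real lock otherwise."""
    # Assumption: None is a placeholder default for caches without maxsize.
    if getattr(cache, 'maxsize', None) == 0:
        return contextlib.nullcontext()
    return threading.Lock()


no_op = choose_lock(ZeroCache())  # zero capacity: locking disabled
real = choose_lock({})            # no maxsize attribute: treated as a cache
```

Either result can be used in a with statement, so calling code need not know whether locking is actually in effect.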

Caching may be specified – to SafeLmdbDict for example – via an instance, a class, or any callable returning an instance of a mapping for use as a deserialization cache. Supplying either an instance or a class is strongly recommended, as these enable checking any cache retrieved from the weak reference registry against the user's instantiation argument.

from lmdb_dict import SafeLmdbDict
from lmdb_dict.cache import LRUCache128

SafeLmdbDict('/path/to/db/directory/', cache=LRUCache128)

Above, we've specified that our SafeLmdbDict should cache deserialized values using an instance of LRUCache128 – that is, a subclass of the LRUCache provided by cachetools. LRUCache128 distinguishes itself only in that it requires no initialization arguments – a requirement of supplying a callable in lieu of a cache instance – and it sets maxsize=128.
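
To show why a class that needs no initialization arguments is convenient as a cache factory, here is a sketch of the pattern. It deliberately uses a small hand-rolled LRU mapping, adapted from the recipe in the Python collections documentation, instead of cachetools, and the names SimpleLRU, SimpleLRU128 and build_cache are hypothetical.

```python
from collections import OrderedDict


class SimpleLRU(OrderedDict):
    """Minimal LRU mapping; a hand-rolled stand-in for cachetools' LRUCache."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        super().__init__()

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self.move_to_end(key)  # mark as most recently used
        return value

    def __setitem__(self, key, value):
        if key in self:
            self.move_to_end(key)
        super().__setitem__(key, value)
        if len(self) > self.maxsize:
            oldest = next(iter(self))  # least recently used key
            del self[oldest]


class SimpleLRU128(SimpleLRU):
    """Zero-argument variant, so the class itself can act as a factory."""

    def __init__(self):
        super().__init__(maxsize=128)


def build_cache(factory):
    """Call a zero-argument factory (a class or other callable) for a cache."""
    return factory()


cache = build_cache(SimpleLRU128)
```

Since SimpleLRU128 takes no arguments, it can be passed around uninstantiated and called wherever a fresh cache is needed, which is the property the text attributes to LRUCache128.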

As a shortcut to the above, lmdb-dict-full provides CachedLmdbDict:

from lmdb_dict import CachedLmdbDict

CachedLmdbDict('/path/to/db/directory/')

CachedLmdbDict differs from other subclasses of LmdbDict in that it defaults to caching via LRUCache128. Other caches may be specified via the cache argument. Supplying an entity with property maxsize=0 – such as the DummyCache – will raise a TypeError.

Str-only

The above concrete subclasses of LmdbDict support arbitrary serializable values in order to best mimic the functionality of the Python dict.

For use-cases supporting str-only (and/or bytes-only) values, all of the above concerns over serialization, caching and locking may be sidestepped.

StrLmdbDict provides the same full-featured dict interface to LMDB, but only for values of type str and bytes.

from lmdb_dict import StrLmdbDict

StrLmdbDict('/path/to/db/directory/')

StrLmdbDict further differs from other subclasses of LmdbDict in that it accepts no cache argument and performs no caching.

License

lmdb-dict-full is distributed under the terms of the MIT license.
