Simple on-disk dictionary

A dictionary that spills to disk.

Chest acts like a dictionary, but it can write its contents to disk. This is useful in two cases:

  1. Chest can hold datasets that are larger than memory

  2. Chest persists and so can be saved and loaded for later use

License

New BSD. See License

Install

chest is on the Python Package Index (PyPI):

pip install chest

Example

>>> from chest import Chest
>>> c = Chest()

>>> # Acts like a normal dictionary
>>> c['x'] = [1, 2, 3]
>>> c['x']
[1, 2, 3]

>>> # Data persists to local files
>>> c.flush()
>>> import os
>>> os.listdir(c.path)
['.keys', 'x']

>>> # These files hold pickled results
>>> import pickle
>>> pickle.load(open(os.path.join(c.path, 'x'), 'rb'))  # c.path is a directory; open the file for key 'x'
[1, 2, 3]

>>> # Though one normally accesses these files with chest itself
>>> c2 = Chest(path=c.path)
>>> c2.keys()
['x']
>>> c2['x']
[1, 2, 3]

>>> # Chest is configurable, so one can use json instead of pickle
>>> import json
>>> c = Chest(path='my-chest', dump=json.dump, load=json.load)
>>> c['x'] = [1, 2, 3]
>>> c.flush()

>>> json.load(open(os.path.join('my-chest', 'x')))  # 'my-chest' is a directory; open the file for key 'x'
[1, 2, 3]
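
The example above passes json.dump and json.load as the dump and load arguments, which suggests that any pair of functions with the same call shape could be used. The following is a minimal sketch of such a pair, assuming Chest calls dump(obj, file) and load(file) and hands them files opened in binary mode; neither assumption is confirmed here. The names dump_compressed and load_compressed are hypothetical and are not part of chest.

```python
import io
import pickle
import zlib


def dump_compressed(obj, file):
    """Write obj to a binary file object as zlib-compressed pickle bytes."""
    file.write(zlib.compress(pickle.dumps(obj)))


def load_compressed(file):
    """Read back an object written by dump_compressed."""
    return pickle.loads(zlib.decompress(file.read()))


# Round trip through an in-memory buffer standing in for a file on disk.
buf = io.BytesIO()
dump_compressed([1, 2, 3], buf)
buf.seek(0)
restored = load_compressed(buf)
```

If the assumptions hold, the pair would be passed the same way as the json functions, for example Chest(dump=dump_compressed, load=load_compressed).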

Known Failings

Chest was designed to hold a moderate number of largish NumPy arrays. It doesn't handle the use case of very many small key-value pairs (though it could with a small effort). In particular, chest has the following deficiencies:

  1. It determines which values to spill to disk by size alone. The largest values are spilled. This could be improved by a better estimate of size (see the nbytes function) and by considering time of use (for example, an LRU mechanism).

  2. Spill conditions are checked after every action, and checking them often involves an O(n log n) sort. This could be improved by tracking the size of each value and updating the total incrementally.

  3. Chest is not multi-process safe. We should institute a file lock at least around the .keys file.

  4. Chest does not support mutation of variables on disk.
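
Items 1 and 2 above suggest tracking sizes incrementally rather than re-measuring and sorting on every action. The following is a minimal sketch of that idea, not chest's implementation. SizeTracker, approx_nbytes, and the budget parameter are hypothetical names introduced for illustration, and chest's own nbytes function may estimate sizes differently.

```python
import sys


def approx_nbytes(value):
    """Hypothetical size estimate: use .nbytes if present, else sys.getsizeof."""
    nbytes = getattr(value, "nbytes", None)
    return nbytes if nbytes is not None else sys.getsizeof(value)


class SizeTracker:
    """Keep per-key sizes and a running total, updated on each change."""

    def __init__(self, budget, sizeof=approx_nbytes):
        self.budget = budget      # bytes allowed to stay in memory
        self.sizeof = sizeof      # size function (an assumption, swappable)
        self.sizes = {}
        self.total = 0

    def set(self, key, value):
        # Adjust the running total in O(1) instead of re-measuring everything.
        self.total -= self.sizes.get(key, 0)
        self.sizes[key] = self.sizeof(value)
        self.total += self.sizes[key]

    def remove(self, key):
        self.total -= self.sizes.pop(key, 0)

    def keys_to_spill(self):
        """Return keys, largest first, whose removal brings total within budget."""
        excess = self.total - self.budget
        if excess <= 0:
            return []  # cheap check: no sorting while under budget
        chosen = []
        by_size = sorted(self.sizes.items(), key=lambda kv: kv[1], reverse=True)
        for key, size in by_size:
            if excess <= 0:
                break
            chosen.append(key)
            excess -= size
        return chosen
```

With a running total, the common under-budget check costs O(1), and the sort runs only when something actually needs to spill. An LRU policy could replace the size-ordered sort, for example by keeping keys in a collections.OrderedDict and spilling from the oldest end.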

Dependencies

Chest supports Python 2.6+ and Python 3.2+ with a common codebase. It is pure Python and requires no dependencies beyond the standard library.

It is, in short, a lightweight dependency.

Author

Chest was originally created by Matthew Rocklin.
