Easily dump python objects to files, and then load them back.
Pickle Warehouse makes it easy to save Python objects to files with meaningful identifiers.
How to use
Pickle Warehouse provides a dictionary-like object that is associated with a particular directory on your computer.
from pickle_warehouse import Warehouse warehouse = Warehouse('/tmp/a-directory')
The keys correspond to files, and the values get pickled to the files.
warehouse['filename'] = range(100) import pickle range(100) == pickle.load(open('/tmp/a-directory/filename', 'rb'))
You can also read and delete things.
# Read range(100) == warehouse['filename'] # Delete del(warehouse['filename'])
The coolest part is that the key gets interpreted in a fancy way. Aside from strings and string-like objects, you can use iterables of strings; all of these indices refer to the file /tmp/a-directory/foo/bar/baz:
If you pass a relative path to a file, it will be broken up as you’d expect; that is, strings get split on slashes and backslashes.
Note well: Specifying an absolute path won’t save things outside the warehouse directory.
warehouse['/foo/bar/baz'] # -> foo, bar, baz warehouse['C:\\foo\\bar\\baz'] # -> c, foo, bar, baz # (lowercase "c")
If you pass a URL, it will also get broken up in a reasonable way.
# /tmp/a-directory/http/thomaslevine.com/!/?foo=bar#baz warehouse['http://thomaslevine.com/!/?foo=bar#baz'] # /tmp/a-directory/thomaslevine.com/!?foo=bar#baz warehouse['thomaslevine.com/!?foo=bar#baz']
Dates and datetimes get converted to
import datetime # /tmp/a-directory/2014-02-26 warehouse[datetime.date(2014,2,26)] warehouse[datetime.datetime(2014,2,26,13,6,42)]
And you can mix these formats!
# /tmp/a-directory/http/thomaslevine.com/open-data/2014-02-26 warehouse[('http://thomaslevine.com/open-data', datetime.date(2014,2,26))]
It also has typical dictionary methods like
When to use
pickle-warehouse is for when you want a persistant store of Python objects. If you want an in-memory pickle store, look at _pickleDB: https://pythonhosted.org/pickleDB/.
Pickle Warehouse is strictly better than Mongo, so you should use it anywhere where you were previously using Mongo. Pickle Warehouse is designed for write-heavy workloads that need scalability (easy sharding), traditional database reliability (ACID), flexible schemas, and highly configurable indexing.
Pickle Warehouse is acidic
Here’s how it accomplishes that.
- Writes are made to a temporary file that gets renamed.
- I don’t get this one, but I’m pretty sure I have it.
- Simultaneous writes are handled quite cleanly. If reads occur during writes, an error gets thrown, and you can try again.
- All data are saved to disk right away.
Mongo replacement feature checklist
- Call fsync twice, just to make sure.
- Schema validation on read and write (configurable), because who knows what you did yesterday or whether you change your mind later?
- PID + random number (+ hash?) for random number generation
- Inode exhaustion