Easily dump python objects to files, and then load them back.

Vlermv
==================
Vlermv makes it easy to save Python
objects to files with meaningful identifiers.
The package Vlermv provides two interfaces.

Vlermv
    ``vlermv.Vlermv`` is a dictionary interface.
Vlermv Cache
    ``vlermv.cache`` is a decorator that you can use for caching the output of a function.

Using Vlermv
-------------------
Vlermv provides a dictionary-like object
that is associated with a particular directory on
your computer. ::

from vlermv import Vlermv
vlermv = Vlermv('/tmp/a-directory')

The keys correspond to files, and the values get
pickled to the files. ::

vlermv['filename'] = range(100)

# The file is an ordinary pickle, so you can read it without Vlermv.
import pickle
with open('/tmp/a-directory/filename', 'rb') as fp:
    range(100) == pickle.load(fp)

You can also read and delete things. ::

# Read
range(100) == vlermv['filename']

# Delete
del vlermv['filename']
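To make the idea concrete, here is a minimal sketch of a directory-backed dictionary. It is not Vlermv's implementation; the class name ``PickleDir`` is hypothetical, and it assumes keys are plain file names with no subdirectories.

```python
import os
import pickle
import tempfile
from collections.abc import MutableMapping


class PickleDir(MutableMapping):
    """Hypothetical stand-in for a directory-backed dictionary (not Vlermv itself)."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key):
        # Simplification: a key is a plain file name inside the directory.
        return os.path.join(self.directory, key)

    def __setitem__(self, key, value):
        with open(self._path(key), 'wb') as fp:
            pickle.dump(value, fp)

    def __getitem__(self, key):
        try:
            with open(self._path(key), 'rb') as fp:
                return pickle.load(fp)
        except FileNotFoundError:
            raise KeyError(key)

    def __delitem__(self, key):
        try:
            os.remove(self._path(key))
        except FileNotFoundError:
            raise KeyError(key)

    def __iter__(self):
        return iter(sorted(os.listdir(self.directory)))

    def __len__(self):
        return len(os.listdir(self.directory))


store = PickleDir(tempfile.mkdtemp())
store['filename'] = range(100)   # Write
loaded = store['filename']       # Read
del store['filename']            # Delete
```

Because the sketch subclasses ``MutableMapping``, methods such as ``keys``, ``items`` and ``update`` come for free from the five methods defined here.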

The coolest part is that keys get interpreted flexibly.
Aside from strings and string-like objects,
you can use iterables of strings; both of these indices refer
to the file ``/tmp/a-directory/foo/bar/baz``::

vlermv[('foo','bar','baz')]
vlermv[['foo','bar','baz']]

If you pass a relative path to a file, it will be broken up as you'd expect;
that is, strings get split on slashes and backslashes. ::

vlermv['foo/bar/baz']
vlermv['foo\\bar\\baz']

Note well: Specifying an absolute path won't save things outside the vlermv directory. ::

vlermv['/foo/bar/baz'] # -> foo, bar, baz
vlermv['C:\\foo\\bar\\baz'] # -> c, foo, bar, baz
# (lowercase "c")
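The path rules above can be sketched roughly as follows. This is an illustration of the described behavior, not Vlermv's actual code; the function name ``split_path_key`` is hypothetical, and dropping ``.`` and ``..`` components is my own assumption about how to stay inside the directory.

```python
import re


def split_path_key(key):
    """Break a path-like string into path components (illustrative only)."""
    result = []
    # Split on both forward slashes and backslashes.
    for part in re.split(r'[/\\]', key):
        if part in ('', '.', '..'):
            # Drop empty pieces (such as the one before a leading slash) and,
            # by assumption, anything that could climb out of the directory.
            continue
        if not result and re.fullmatch(r'[A-Za-z]:', part):
            # A leading Windows drive such as "C:" becomes a lowercase
            # component, matching the "c" shown above.
            part = part[0].lower()
        result.append(part)
    return result


relative = split_path_key('foo/bar/baz')
absolute = split_path_key('/foo/bar/baz')
windows = split_path_key('C:\\foo\\bar\\baz')
```

Joining the resulting components under the base directory can never produce a path outside it, which is the property the note above describes.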

If you pass a URL, it will also get broken up in a reasonable way. ::

# /tmp/a-directory/http/thomaslevine.com/!/?foo=bar#baz
vlermv['http://thomaslevine.com/!/?foo=bar#baz']

# /tmp/a-directory/thomaslevine.com/!?foo=bar#baz
vlermv['thomaslevine.com/!?foo=bar#baz']

Dates and datetimes get converted to :code:`YYYY-MM-DD` format. ::

import datetime

# /tmp/a-directory/2014-02-26
vlermv[datetime.date(2014,2,26)]
vlermv[datetime.datetime(2014,2,26,13,6,42)]

And you can mix these formats! ::

# /tmp/a-directory/http/thomaslevine.com/open-data/2014-02-26
vlermv[('http://thomaslevine.com/open-data', datetime.date(2014,2,26))]
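One way to picture how mixed keys could be flattened is a small recursive function. This is only a sketch that reproduces the examples above; ``to_parts`` is a hypothetical name, and the real transformation in Vlermv may differ in its details.

```python
import datetime
import re


def to_parts(key):
    """Flatten a key into a list of path components (a sketch, not Vlermv's code)."""
    if isinstance(key, datetime.date):
        # datetime.datetime is a subclass of datetime.date; both keep only
        # the calendar date.
        return [key.strftime('%Y-%m-%d')]
    if isinstance(key, str):
        # Turn "scheme://rest" into "scheme/rest", then split on slashes.
        key = re.sub(r'^([A-Za-z][A-Za-z0-9+.-]*)://', r'\1/', key)
        return [part for part in re.split(r'[/\\]', key) if part]
    # Anything else is treated as an iterable of sub-keys.
    parts = []
    for sub_key in key:
        parts.extend(to_parts(sub_key))
    return parts


mixed = to_parts(('http://thomaslevine.com/open-data', datetime.date(2014, 2, 26)))
# mixed == ['http', 'thomaslevine.com', 'open-data', '2014-02-26']
```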

It also has typical dictionary methods like :code:`keys`, :code:`values`, :code:`items`,
and :code:`update`.

Using Vlermv Cache
-------------------
A function receives input, does something, and then returns output.
If you decorate a function with Vlermv Cache, it caches the output;
if you call the function again with the same input, it loads the
output from the cache instead of doing what it would normally do.

The simplest usage is to decorate the function with ``@vlermv.cache()``.
For example, ::

import vlermv

@vlermv.cache()
def is_prime(number):
    for n in range(2, number):
        if number % n == 0:
            return False
    return True

Now you can call ``is_prime`` as if it's a normal function, and
if you call it twice, the second call will load from the cache.
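If you are curious what such a decorator does under the hood, here is a minimal sketch of a file-backed cache. It is not how ``vlermv.cache`` is implemented; the name ``file_cache`` and the way arguments are turned into a file name are assumptions made for illustration.

```python
import functools
import os
import pickle
import tempfile


def file_cache(directory):
    """Hypothetical decorator that caches results as pickles in ``directory``."""
    os.makedirs(directory, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            # One file per distinct set of positional arguments.
            path = os.path.join(directory, '-'.join(str(arg) for arg in args))
            if os.path.exists(path):
                with open(path, 'rb') as fp:
                    return pickle.load(fp)
            result = func(*args)
            with open(path, 'wb') as fp:
                pickle.dump(result, fp)
            return result
        return wrapper
    return decorator


calls = []


@file_cache(tempfile.mkdtemp())
def is_prime(number):
    calls.append(number)  # Record real executions so cache hits are visible.
    return all(number % n != 0 for n in range(2, number))


first = is_prime(97)   # Computed and written to disk
second = is_prime(97)  # Loaded from the pickle; the body does not run again
```

After both calls, ``calls`` contains a single entry, which shows that the second call never reached the function body.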

Some fancier uses are discussed below.

Non-default directory
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you pass no arguments to cache, as in the example above,
the cache will be stored in a directory named after the function.
To set a different directory, pass it as an argument. ::

@vlermv.cache('~/.primes')
def is_prime(number):
    for n in range(2, number):
        if number % n == 0:
            return False
    return True

I recommend storing your caches in dotted directories under your
home directory, as you see above.

Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The kwargs get passed to ``vlermv.Vlermv``, so you
can do fun things like changing the serialization function. ::

import requests

@vlermv.cache('~/.http', serializer = vlermv.serializers.identity)
def get(url):
    return requests.get(url).text

Read more about the keyword arguments in the Vlermv section above.

Non-identifying arguments
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you want to pass an argument but not use it as an identifier,
pass it as a keyword argument; keyword arguments get passed along to
the function but don't form the identifier. For example, ::

@vlermv.cache('~/.http')
def get(url, auth = None):
    return requests.get(url, auth = auth)

get('http://this.website.com', auth = ('username', 'password'))
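A consequence of this rule is that two calls differing only in keyword arguments share one cache entry. The in-memory sketch below illustrates that; ``cache_by_positional_args`` is a hypothetical helper, and the ``get`` here is a stand-in that makes no network request.

```python
def cache_by_positional_args(func):
    """Memoize on positional arguments only (an in-memory sketch)."""
    cache = {}

    def wrapper(*args, **kwargs):
        # Keyword arguments are forwarded but never become part of the key.
        if args not in cache:
            cache[args] = func(*args, **kwargs)
        return cache[args]

    return wrapper


requests_made = []


@cache_by_positional_args
def get(url, auth=None):
    requests_made.append((url, auth))  # Stand-in for a real HTTP request.
    return 'response for %s' % url


a = get('http://this.website.com', auth=('username', 'password'))
b = get('http://this.website.com', auth=('other', 'secret'))
```

Here ``b`` comes from the cache even though ``auth`` changed, so only use keyword arguments for things that should not affect the result.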

Refreshing the cache
~~~~~~~~~~~~~~~~~~~~~~~~~~
I find that I sometimes want to refresh the cache for just one
particular file. This is usually because an error occurred and I have
since fixed it. You can delete the cached value like this. ::

@vlermv.cache()
def is_prime(number):
    for n in range(2, number):
        if number % n == 0:
            return False
    return True

is_prime(100)
del is_prime[100]

Vlermv Cache has all of Vlermv's features
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The above method for refreshing the cache works because ``is_prime``
isn't really a function; it's actually a ``VlermvCache`` object, which
is a subclass of ``Vlermv``. Thus, you can use it in all of the ways
that you can use ``Vlermv``. ::

@vlermv.cache()
def f(x, y):
    return x + y

print(f(3,4))
# 7

print(list(f.keys()))
# ['3/4']

You can even set the value to be something weird. ::

f[('a', 8)] = None, {'key':'value'}
print(f('a', 8))
# {'key': 'value'}

Each value in ``f`` is a tuple of the error and the actual value.
Exactly one of these is always ``None``. If the error is None, the
value is returned, and if the value is None, the error is raised.
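The error-or-value convention can be sketched in a few lines. The helper names ``call_and_record`` and ``replay`` are hypothetical, and this is an illustration of the convention described above rather than Vlermv's own code.

```python
def call_and_record(func, *args):
    """Run func and return an (error, value) pair."""
    try:
        return None, func(*args)
    except Exception as error:
        return error, None


def replay(record):
    """Return the stored value, or re-raise the stored error."""
    error, value = record
    if error is not None:
        raise error
    return value


ok = call_and_record(int, '42')             # (None, 42)
bad = call_and_record(int, 'not a number')  # (ValueError(...), None)

try:
    replay(bad)
    raised = False
except ValueError:
    raised = True
```

Storing the error alongside the value means a failing call is cached too, which is why deleting the entry is the way to retry after fixing a bug.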

Better than Mongo
---------------------
Vlermv is nearly better than Mongo, so you should use it anywhere
you were previously using Mongo. Vlermv is designed for
write-heavy workloads that need scalability (easy sharding), flexible
schemas, and highly configurable indexing.

Things that are missing for a full Mongo replacement:

* Protection against inode exhaustion
* Ability to delete an entire directory for atomic edits within a document
* Transactions (Mongo `doesn't have them <http://docs.mongodb.org/manual/tutorial/perform-two-phase-commits/>`_, but they would be cool.)
* Indices maybe? In case you want an index on something other than the filename

ACID properties
----------------------------
Atomicity
    Writes are made to a temporary file that gets renamed.
Consistency
    No validation is supported, so the database is always consistent by definition.
Isolation
    Vlermv has isolation within files/documents/values but not across. You may implement your own multi-file transactions.
Durability
    All data are saved to disk right away.
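The write-then-rename pattern behind the atomicity claim looks roughly like this. It is a generic sketch of the technique using the standard library, not Vlermv's code; ``atomic_dump`` is a hypothetical name, and the ``fsync`` call is my own addition to illustrate the durability side.

```python
import os
import pickle
import tempfile


def atomic_dump(obj, path):
    """Pickle obj to path via a temporary file and a rename (a sketch)."""
    directory = os.path.dirname(os.path.abspath(path))
    os.makedirs(directory, exist_ok=True)
    # Create the temporary file in the destination directory so the rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, 'wb') as fp:
            pickle.dump(obj, fp)
            fp.flush()
            os.fsync(fp.fileno())  # Push the bytes to disk before renaming.
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise


target = os.path.join(tempfile.mkdtemp(), 'foo', 'bar')
atomic_dump({'key': 'value'}, target)
with open(target, 'rb') as fp:
    restored = pickle.load(fp)
```

The rename only protects a single file, which matches the isolation note above: anything spanning several files needs its own coordination.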

Source distribution: vlermv-0.2.1.tar.gz (6.1 kB)
