ZODB Persistence in a Google Protocol Buffer
Project description
Overview
The keas.pbpersist package provides a way to store ZODB objects using Google Protocol Buffer encoding rather than pickling. It requires the keas.pbstate package. The Protocol Buffer format is more limited than pickles, but it is also simpler, better documented, and not dependent on any particular programming language.
Only Persistent objects that use the ProtobufState metaclass (from the keas.pbstate.meta module) are eligible for this kind of serialization, so applications that want to take advantage of keas.pbpersist need to be refactored at the model level. However, applications can be refactored gradually, since protobuf encoded objects and pickled objects can coexist and hold references to each other within a single database.
This package is expected to be fully compatible with most ZODB storages, including FileStorage, ZEO, RelStorage, and any other storage that treats object records as opaque binary streams.
Note that this package requires a patch to ZODB; see the “-polling-serial” patched version at the following site:
How to Use This Package
The documentation below is in doctest format, meaning that it serves as both documentation and tests. You can try out the code in a Python interpreter session.
Register the protobuf serializer with ZODB. This only needs to happen once per application run, but it does no harm if it happens more than once.
>>> from keas.pbpersist.pbformat import register >>> register() >>> register()
PContact is a simple Persistent class that stores its state in a protobuf message. See the source code; it’s quite simple. For this test, create an instance an fill out the required field.
>>> from keas.pbpersist.tests import PContact >>> bob = PContact() >>> bob.name = u'Bob'
Set up an in-memory object database and put the PContact in it. Of course, if you already have a database, you can instead attach the object to some object in your own database.
>>> from ZODB.DemoStorage import DemoStorage >>> from ZODB.DB import DB >>> import transaction >>> storage = DemoStorage() >>> db = DB(storage, database_name='main') >>> conn1 = db.open() >>> conn1.root()['bob'] = bob >>> transaction.commit()
Let’s take a peek at what got stored in the database. Look Ma, no pickles! The {protobuf} prefix tells ZODB to use code in the keas.pbpersist package to deserialize this object.
>>> data, serial = storage.load(bob._p_oid, '') >>> data '{protobuf}\n \n\x14keas.pbpersist.tests\x12\x08PContact\x12\x07\x08\x01\x12\x03Bob'
Here is a demonstration of how to crack open that record using nothing but language independent operations. The ObjectRecord message holds the object’s class, state, and references; the state field holds a class-specific protobuf message.
>>> from keas.pbpersist.persistent_pb2 import ObjectRecord >>> rec = ObjectRecord() >>> rec.MergeFromString(data[10:]) # skip the {protobuf} prefix 43 >>> rec.class_meta.module_name u'keas.pbpersist.tests' >>> rec.class_meta.class_name u'PContact' >>> len(rec.references) 0 >>> pbuf = PContact.protobuf_type() >>> pbuf.MergeFromString(rec.state) 7 >>> pbuf.name u'Bob'
You could easily emulate that code in C++ or Java, the other two languages supported by Protocol Buffers at this time. However, that was a noisy, complex operation. ZODB makes reading objects dramatically simpler.
>>> conn2 = db.open() >>> bob2 = conn2.root()['bob'] >>> bob2.name u'Bob' >>> conn2.close()
References
Create another PContact and assign that contact to be a guardian for Bob. The add() method below is part of Google’s protobuf API. The protobuf_refs attribute is provided by the ProtobufState metaclass in keas.pbstate.
>>> alice = PContact() >>> alice.name = u'Alice' >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, alice) >>> transaction.commit()
[Ed: I sure don’t like the way we manage references right now. I’d much rather see “bob.guardians.append(alice)”, but that is not yet possible.]
Now let’s see what happens if we get Alice through Bob in a different ZODB connection.
>>> conn2 = db.open() >>> bob2 = conn2.root()['bob'] >>> alice2 = bob2.protobuf_refs.get(bob2.guardians[0]) >>> alice2.name u'Alice'
Note that alice2 is a duplicate of alice, not the same object, because the two come from different ZODB connections.
>>> alice is alice2 False
We also have the option of referring to objects by OID, but that is rarely necessary.
>>> conn3 = db.open() >>> bob3 = conn3[bob._p_oid]
ZODB loads all objects lazily, including objects serialized using this package. When a persistent object’s _p_changed attribute is None, the object is in the “ghost” state; it does not yet contain data. Accessing an attribute transparently loads the object state.
>>> bob3._p_changed is None True >>> bob3.name u'Bob' >>> bob3._p_changed False
Clean up the extra connections.
>>> conn3.close() >>> conn2.close()
Packing
Storages need to be able to follow chains of references in order to pack, so let’s make sure the referencesf() function still does what it should.
>>> from ZODB.serialize import referencesf >>> data, serial = storage.load(conn1.root()._p_oid, '') >>> referencesf(data) == [bob._p_oid] True >>> data, serial = storage.load(bob._p_oid, '') >>> referencesf(data) == [alice._p_oid] True >>> data, serial = storage.load(alice._p_oid, '') >>> referencesf(data) []
Reference Capabilities And Limitations
Objects serialized using this package can refer to any other Persistent objects. (Note that we already demonstrated that pickled objects can refer to protobuf serialized objects when we added “bob” to the root object in the database.)
>>> from persistent.mapping import PersistentMapping >>> len(bob.guardians) 1 >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, PersistentMapping()) >>> transaction.commit() >>> len(bob.guardians) 2
However, objects serialized using this package can not hold a reference to non-Persistent objects. (This is a limitation of keas.pbpersist, but not a limitation of keas.pbstate, which does not depend on ZODB.) We don’t detect bad references until the first phase of transaction commit.
>>> len(bob.guardians) 2 >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, {}) >>> len(bob.guardians) 3 >>> transaction.commit() Traceback (most recent call last): ... POSError: Protobuf reference target is not a Persistent object: {}
Aborting the transaction undoes the damage.
>>> transaction.abort() >>> len(bob.guardians) 2
We can make weak references, though!
>>> from persistent.wref import WeakRef >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, WeakRef(PersistentMapping())) >>> transaction.commit() >>> len(bob.guardians) 3
We can make cross-database references.
>>> storage_beta = DemoStorage() >>> db_beta = DB(storage_beta, database_name='beta', databases=db.databases) >>> root_beta = conn1.get_connection('beta').root() >>> root_beta['obj'] = PersistentMapping() >>> transaction.commit() >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, root_beta['obj']) >>> transaction.commit()
We can make a reference to an instance of a class with a __getnewargs__ method. ZODB does not store the class metadata in references to instances of such classes.
>>> from keas.pbpersist.tests import Adder >>> adder = Adder(7) >>> conn1.root()['adder'] = adder >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, adder) >>> transaction.commit()
We can make a reference to an instance of a class with a __getnewargs__ method, even when that instance is in another database.
>>> adder = Adder(6) >>> root_beta['adder'] = adder >>> transaction.commit() >>> ref = bob.guardians.add() >>> bob.protobuf_refs.set(ref, adder) >>> transaction.commit()
Just to be sure all those references work, access all of them in another connection.
>>> conn2 = db.open() >>> bob2 = conn2.root()['bob'] >>> len(bob2.guardians) 6>>> bob2.protobuf_refs.get(bob2.guardians[0]) <keas.pbpersist.tests.PContact object at ...>>>> bob2.protobuf_refs.get(bob2.guardians[1]) {} >>> bob2.protobuf_refs.get(bob2.guardians[1])._p_jar is conn2 True>>> bob2.protobuf_refs.get(bob2.guardians[2]) <persistent.wref.WeakRef object at ...> >>> bob2.protobuf_refs.get(bob2.guardians[2])() {}>>> bob2.protobuf_refs.get(bob2.guardians[3]) {} >>> bob2.protobuf_refs.get(bob2.guardians[3])._p_jar is conn2 False>>> bob2.protobuf_refs.get(bob2.guardians[4]) <keas.pbpersist.tests.Adder object at ...> >>> bob2.protobuf_refs.get(bob2.guardians[4]).add(3) 10>>> bob2.protobuf_refs.get(bob2.guardians[5]) <keas.pbpersist.tests.Adder object at ...> >>> bob2.protobuf_refs.get(bob2.guardians[5]).add(3) 9>>> conn2.close()
Edge Cases
We can’t currently store instances of a ProtobufState Persistent class that implements __getnewargs__().
>>> from keas.pbpersist.tests import PContactWithGetNewArgs >>> fail_contact = PContactWithGetNewArgs() >>> fail_contact.name = u'Loser' >>> conn1.root()['fail_contact'] = fail_contact >>> transaction.commit() Traceback (most recent call last): ... POSError: ProtobufSerializer can not serialize classes using __getnewargs__ or __getinitargs__
Clean Up
The tests are finished. Close the object database.
>>> transaction.abort() >>> conn1.close() >>> db.close()
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.