Python implementation of Protocol Buffer (protobuf) data types
protobuf
My own implementation of Google's Protocol Buffers.
Changes in v0.3.1
- encoding module became protobuf module.
- Performance tests.
- Bool.dump 2.2 times faster.
- Varint 14% faster.
- add_field chaining.
- __hash__ 17% faster.
Changes in v0.3
- README techniques added.
- Hashes of message types.
- Fixed: loading of missing required field doesn't raise ValueError.
- Message load doesn't use StringIO for reading embedded messages and packed repeated fields anymore.
- TypeMetadata! (read below)
- Removed MarshalableCode (it's not protobuf's business).
- Fixed: reading of Int32 values raises TypeError: 'str' object is not callable.
Changes in v0.2
- Fixed Int32 type name (was Int32Type).
- Added validation of message type.
- Unicode type.
- Python code object type.
- Fixed casting values to bool and from bool.
Using
For now, there is a full implementation of the protobuf encoding, so you can use the protobuf module (called encoding before v0.3.1) with full compatibility with the standard implementation.
The protobuf module is covered with tests, but there may still be unknown bugs. Use this software at your own risk.
Do from protobuf import * and you're ready to go.
Note: all names of message types are similar to those described there. ;-)
Sample 1. Introduction
Assume you have the following definition:
message Test2 {
string b = 2;
}
First, you should create the message type:
Test2 = MessageType()
Test2.add_field(2, 'b', String)
Then, create a message and fill it with the appropriate data:
msg = Test2()
msg.b = 'testing'
You can dump this now!
print msg.dumps() # This will dump into a string.
msg.dump(open('/tmp/message', 'wb')) # And this will dump into any write-like object.
You also can load this message with:
msg = Test2.load(open('/tmp/message', 'rb'))
or with:
msg = load(open('/tmp/message', 'rb'), Test2)
Simple enough. :)
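To see what such a dump contains, here is a minimal sketch of the wire format for this message, written without the library. The helper names (encode_varint, encode_key, encode_string_field) are made up for illustration and are not part of the protobuf module's API.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_key(field_number, wire_type):
    """A field key is the field number and the wire type packed in one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_string_field(field_number, text):
    """Wire type 2 (length-delimited): key, payload length, payload bytes."""
    payload = text.encode("utf-8")
    return encode_key(field_number, 2) + encode_varint(len(payload)) + payload


encoded = encode_string_field(2, u"testing")
# encoded == b"\x12\x07testing"
```

The first byte 0x12 is the key for field 2 with wire type 2, the second byte is the length 7, and the rest is the UTF-8 text.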
Sample 2. Required field
To add a required field, pass an additional flags parameter to add_field like this:
Test2 = MessageType()
Test2.add_field(2, 'b', String, flags=Flags.REQUIRED)
If you do not fill a required field, ValueError will be raised during serialization.
Sample 3. Repeated field
Do it like this:
Test2 = MessageType()
Test2.add_field(1, 'b', UVarint, flags=Flags.REPEATED)
msg = Test2()
msg.b = (1, 2, 3)
The value of a repeated field can be any iterable object. The loaded value will always be a list.
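On the wire, a (non-packed) repeated field is simply written once per element, each time with its own key. A minimal sketch, independent of the library; encode_varint and encode_repeated_varints are hypothetical helper names:

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_repeated_varints(field_number, values):
    """Emit key + value for every element; wire type 0 means varint."""
    key = encode_varint((field_number << 3) | 0)
    return b"".join(key + encode_varint(value) for value in values)


encoded = encode_repeated_varints(1, (1, 2, 3))
# encoded == b"\x08\x01\x08\x02\x08\x03"
```

Because the encoder only iterates over the values, a tuple, a list or a generator all produce the same bytes.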
Sample 4. Packed repeated field
Test4 = MessageType()
Test4.add_field(4, 'd', UVarint, flags=Flags.PACKED_REPEATED)
msg = Test4()
msg.d = (3, 270, 86942)
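A packed repeated field differs from a plain repeated one in that the key is written only once, followed by the total byte length and all the values back to back. The sketch below reproduces the bytes for the values above without using the library; the helper names are hypothetical.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_packed_varints(field_number, values):
    """One key (wire type 2), one length prefix, then all varints in a row."""
    payload = b"".join(encode_varint(value) for value in values)
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(payload)) + payload


encoded = encode_packed_varints(4, (3, 270, 86942))
# encoded == b"\x22\x06\x03\x8e\x02\x9e\xa7\x05"
```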
Sample 5. Embedded messages
Consider the following definitions:
message Test1 {
int32 a = 1;
}
and
message Test3 {
required Test1 c = 3;
}
To create an embedded field, pass EmbeddedMessage as the type of the field and fill it like this:
# Create the type.
Test1 = MessageType()
Test1.add_field(1, 'a', UVarint)
Test3 = MessageType()
Test3.add_field(3, 'c', EmbeddedMessage(Test1))
# Fill the message.
msg = Test3()
msg.c = Test1()
msg.c.a = 150
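An embedded message is encoded like a byte string: the inner message is serialized first and then written as a length-delimited field of the outer one. A minimal sketch of the resulting bytes, with hypothetical helper names and no dependency on the library:

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_varint_field(field_number, value):
    """Wire type 0: key followed by the varint value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_embedded_field(field_number, inner_bytes):
    """Wire type 2: key, length of the inner message, inner message bytes."""
    key = encode_varint((field_number << 3) | 2)
    return key + encode_varint(len(inner_bytes)) + inner_bytes


inner = encode_varint_field(1, 150)      # Test1 with a = 150
outer = encode_embedded_field(3, inner)  # Test3 with c = that Test1
# inner == b"\x08\x96\x01"
# outer == b"\x1a\x03\x08\x96\x01"
```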
Data types
The following data types are supported for now:
UVarint # Unsigned integer.
Varint # Signed integer.
Bool # Boolean.
Fixed64 # 8-byte string.
UInt64 # C++'s 64-bit `unsigned long long`.
Int64 # C++'s 64-bit `long long`.
Float64 # C++'s `double`.
Fixed32 # 4-byte string.
UInt32 # C++'s 32-bit `unsigned int`.
Int32 # C++'s 32-bit `int`.
Float32 # C++'s `float`.
Bytes # Pure bytes string.
Unicode # Unicode string.
TypeMetadata # Type that describes another type.
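The fixed-width types correspond to little-endian values in the protobuf wire format. The sketch below shows those layouts with the standard struct module. It assumes, which this README does not confirm, that the library's types use exactly these layouts; FIXED_FORMATS and pack_fixed are names invented for the example.

```python
import struct

# Type name from the list above -> struct format ("<" means little-endian).
FIXED_FORMATS = {
    "UInt64": "<Q",   # 8 bytes, unsigned
    "Int64": "<q",    # 8 bytes, signed
    "Float64": "<d",  # 8 bytes, IEEE 754 double
    "UInt32": "<I",   # 4 bytes, unsigned
    "Int32": "<i",    # 4 bytes, signed
    "Float32": "<f",  # 4 bytes, IEEE 754 float
}


def pack_fixed(type_name, value):
    """Pack a value into the fixed-width layout assumed for type_name."""
    return struct.pack(FIXED_FORMATS[type_name], value)


minus_one = pack_fixed("Int32", -1)    # b"\xff\xff\xff\xff"
one_float = pack_fixed("Float32", 1.0) # b"\x00\x00\x80\x3f"
one_u64 = pack_fixed("UInt64", 1)      # b"\x01" followed by seven zero bytes
```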
Some techniques
Streaming messages
The Protocol Buffer format is not self-delimiting. But you can wrap your message type in the EmbeddedMessage class and write/read it sequentially.
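The idea behind this is length-prefixed framing: every message is preceded by its byte length, so the reader knows where it ends. A minimal sketch of that idea with plain byte strings and a varint length prefix; the function names are hypothetical, and whether EmbeddedMessage produces exactly this framing is an assumption.

```python
import io


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def read_varint(stream):
    """Read one varint from a binary stream; EOFError if it is cut short."""
    result, shift = 0, 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise EOFError("stream ended inside a varint")
        byte = ord(chunk)
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def write_delimited(stream, payload):
    """Write the payload length, then the payload itself."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream):
    """Read one length-prefixed payload back."""
    length = read_varint(stream)
    payload = stream.read(length)
    if len(payload) != length:
        raise EOFError("stream ended inside a message")
    return payload


buffer = io.BytesIO()
for message in (b"\x12\x07testing", b"\x08\x96\x01"):
    write_delimited(buffer, message)
buffer.seek(0)
first = read_delimited(buffer)   # b"\x12\x07testing"
second = read_delimited(buffer)  # b"\x08\x96\x01"
```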
The other option is to use protobuf.EofWrapper, which takes a limit parameter in its constructor. The EofWrapper raises EOFError when the specified number of bytes has been read.
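To illustrate the concept, here is a hypothetical stand-in for such a wrapper. LimitedReader is not the library's EofWrapper: its name, its read signature, and the exact moment at which EOFError is raised are assumptions made for this sketch.

```python
import io


class LimitedReader(object):
    """Hypothetical read wrapper that exposes at most `limit` bytes."""

    def __init__(self, stream, limit):
        self._stream = stream
        self._remaining = limit

    def read(self, size):
        # Once the limit is used up, behave as if the stream had ended.
        if self._remaining <= 0:
            raise EOFError("limit reached")
        data = self._stream.read(min(size, self._remaining))
        if not data:
            raise EOFError("underlying stream ended before the limit")
        self._remaining -= len(data)
        return data


stream = io.BytesIO(b"abcdefgh")
reader = LimitedReader(stream, 5)
chunks = []
try:
    while True:
        chunks.append(reader.read(2))
except EOFError:
    pass
consumed = b"".join(chunks)  # b"abcde"
rest = stream.read()         # b"fgh", left untouched for the next message
```

A parser that reads "until EOF" can then be pointed at one message inside a longer stream without consuming the bytes that follow it.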
Self-describing messages and TypeMetadata
A message does not contain any description of its own type. Therefore, if you want to send self-describing messages, you should send a description of the message type too.
I've implemented a tool for this... Look:
A, B, C = MessageType(), MessageType(), MessageType()
A.add_field(1, 'a', UVarint)
A.add_field(2, 'b', TypeMetadata, flags=Flags.REPEATED) # <- Look here!
A.add_field(3, 'c', Bytes)
B.add_field(4, 'ololo', Float32)
B.add_field(5, 'c', TypeMetadata, flags=Flags.REPEATED) # <- And here!
B.add_field(6, 'd', Bool, flags=Flags.PACKED_REPEATED)
C.add_field(7, 'ghjhdf', UVarint)
msg = A()
msg.a = 1
msg.b = [B, C] # Assigning of types.
msg.c = 'ololo'
bytes = msg.dumps()
...
msg = A.loads(bytes)
msg2 = msg.b[0]() # Creating a message of the loaded type.
You can send your bytes anywhere and you'll get your message type on the other side!
add_field chaining
add_field returns the message type itself, so you can do this:
MessageType().add_field(1, 'a', EmbeddedMessage(MessageType().add_field(1, 'a', UVarint)))
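Chaining works for any method that returns the object it was called on. The class below is a hypothetical, minimal stand-in that only records fields; it is not the library's MessageType and is here just to show the pattern.

```python
class SketchMessageType(object):
    """Hypothetical stand-in: stores field definitions and nothing else."""

    def __init__(self):
        self.fields = {}

    def add_field(self, number, name, field_type):
        self.fields[number] = (name, field_type)
        return self  # returning self is what makes chaining possible


# Type names are plain strings here because the sketch has no real types.
Point = (
    SketchMessageType()
    .add_field(1, 'x', 'UVarint')
    .add_field(2, 'y', 'UVarint')
)
# Point.fields == {1: ('x', 'UVarint'), 2: ('y', 'UVarint')}
```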
More info
See the protobuf module for the API and the run-tests module for more usage samples.