Convert from protobuf to arrow and back
Protarrow
A library for converting from protobuf to arrow and back
Installation
pip install protarrow
Usage
Convert from proto to arrow
message MyProto {
  string name = 1;
  int32 id = 2;
  repeated int32 values = 3;
}
import protarrow

# MyProto is the Python class generated by protoc from the message above.
my_protos = [
    MyProto(name="foo", id=1, values=[1, 2, 4]),
    MyProto(name="bar", id=2, values=[3, 4, 5]),
]

schema = protarrow.message_type_to_schema(MyProto)
record_batch = protarrow.messages_to_record_batch(my_protos, MyProto)
table = protarrow.messages_to_table(my_protos, MyProto)
name | id | values |
---|---|---|
foo | 1 | [1 2 4] |
bar | 2 | [3 4 5] |
Convert from arrow to proto
protos_from_record_batch = protarrow.record_batch_to_messages(record_batch, MyProto)
protos_from_table = protarrow.table_to_messages(table, MyProto)
Customize arrow type
The arrow type for Enum, Timestamp and TimeOfDay can be configured:
import pyarrow as pa

config = protarrow.ProtarrowConfig(
    enum_type=pa.int32(),
    timestamp_type=pa.timestamp("ms", "America/New_York"),
    time_of_day_type=pa.time32("ms"),
)
record_batch = protarrow.messages_to_record_batch(my_protos, MyProto, config)
Type Mapping
Native Types
Proto | Pyarrow | Note |
---|---|---|
bool | bool_ | |
bytes | binary | |
double | float64 | |
enum | int32/string/binary | configurable |
fixed32 | int32 | |
fixed64 | int64 | |
float | float32 | |
int32 | int32 | |
int64 | int64 | |
message | struct | |
sfixed32 | int32 | |
sfixed64 | int64 | |
sint32 | int32 | |
sint64 | int64 | |
string | string | |
uint32 | uint32 | |
uint64 | uint64 | |
Other types
Proto | Pyarrow | Note |
---|---|---|
repeated | list_ | |
map | map_ | |
google.protobuf.BoolValue | bool_ | |
google.protobuf.BytesValue | binary | |
google.protobuf.DoubleValue | float64 | |
google.protobuf.FloatValue | float32 | |
google.protobuf.Int32Value | int32 | |
google.protobuf.Int64Value | int64 | |
google.protobuf.StringValue | string | |
google.protobuf.Timestamp | timestamp("ns", "UTC") | Unit and timezone are configurable |
google.protobuf.UInt32Value | uint32 | |
google.protobuf.UInt64Value | uint64 | |
google.type.Date | date32() | |
google.type.TimeOfDay | time64/time32 | Unit and type are configurable |
Nullability
- Top-level native fields, lists and maps are marked as non-nullable.
- Nested messages and their children are nullable.