Generate Avro Schemas from a Python class
Dataclasses Avro Schema Generator
Requirements
Python 3.7+
Installation
pip install dataclasses-avroschema
Documentation
https://marcosschroh.github.io/dataclasses-avroschema/
Usage
Generating the avro schema
import typing

from dataclasses_avroschema import SchemaGenerator


class User:
    "An User"
    name: str
    age: int
    pets: typing.List[str]
    accounts: typing.Dict[str, int]
    favorite_colors: typing.Tuple[str] = ("BLUE", "YELLOW", "GREEN")
    country: str = "Argentina"
    address: str = None


SchemaGenerator(User).avro_schema()
'{
  "type": "record",
  "name": "User",
  "doc": "An User",
  "fields": [
    {"name": "name", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "pets", "type": "array", "items": "string"},
    {"name": "accounts", "type": "map", "values": "int"},
    {"name": "favorite_colors", "type": "enum", "symbols": ["BLUE", "YELLOW", "GREEN"]},
    {"name": "country", "type": ["string", "null"], "default": "Argentina"},
    {"name": "address", "type": ["null", "string"], "default": "null"}
  ]
}'
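To illustrate the idea behind schema generation, here is a minimal sketch that maps type hints to Avro field types using only the standard library. It is not the library's implementation: the `PRIMITIVES` table and the `avro_type` and `record_schema` helpers are hypothetical names introduced for this example, and only `str`, `int`, `bool`, `typing.List` and `typing.Dict` are handled.

```python
import dataclasses
import json
import typing

# Hypothetical lookup table for this sketch; the library's own mapping is richer.
PRIMITIVES = {str: "string", int: "int", bool: "boolean"}


def avro_type(hint):
    """Translate one type hint into an Avro type (primitives, lists and dicts only)."""
    if hint in PRIMITIVES:
        return {"type": PRIMITIVES[hint]}
    origin = getattr(hint, "__origin__", None)
    args = getattr(hint, "__args__", ())
    if origin is list:
        return {"type": "array", "items": avro_type(args[0])["type"]}
    if origin is dict:
        # Avro map keys are always strings, so only the value type is recorded.
        return {"type": "map", "values": avro_type(args[1])["type"]}
    raise TypeError(f"unsupported type hint: {hint!r}")


def record_schema(cls):
    """Build an Avro record schema (as a dict) from a class's annotations."""
    fields = [
        {"name": name, **avro_type(hint)}
        for name, hint in typing.get_type_hints(cls).items()
    ]
    return {
        "type": "record",
        "name": cls.__name__,
        "doc": cls.__doc__,
        "fields": fields,
    }


@dataclasses.dataclass
class User:
    "An User"
    name: str
    age: int
    pets: typing.List[str]
    accounts: typing.Dict[str, int]


schema = record_schema(User)
print(json.dumps(schema, indent=2))
```

The sketch deliberately skips defaults, enums, unions and nested records, which the library supports.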
Serialization to avro or avro-json
import typing
from dataclasses import dataclass

from dataclasses_avroschema import SchemaGenerator


@dataclass
class Address:
    "An Address"
    street: str
    street_number: int


@dataclass
class User:
    "User with multiple Address"
    name: str
    age: int
    addresses: typing.List[Address]


address_data = {
    "street": "test",
    "street_number": 10,
}

# create an Address instance
address = Address(**address_data)

data_user = {
    "name": "john",
    "age": 20,
    "addresses": [address],
}

# create a User instance
user = User(**data_user)

schema = SchemaGenerator(user)

schema.serialize()
# >>> b"\x08john(\x02\x08test\x14\x00"

schema.serialize(serialization_type="avro-json")
# >>> b'{"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}'
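The binary output above follows the Avro binary encoding: integers are written as zig-zag variable-length values, strings as a length followed by UTF-8 bytes, and arrays as a block count, the items, and a terminating zero. The sketch below reproduces those bytes by hand with the standard library. It is an illustration of the encoding rules, not the code path the library uses, and all helper names (`zigzag_varint`, `encode_string`, `encode_array`, `encode_address`, `encode_user`) are hypothetical.

```python
def zigzag_varint(n: int) -> bytes:
    """Encode an integer the Avro way: zig-zag, then base-128 variable length."""
    z = (n << 1) ^ (n >> 63)  # zig-zag maps small negatives to small positives
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(value: str) -> bytes:
    """A string is its byte length followed by the UTF-8 bytes."""
    raw = value.encode("utf-8")
    return zigzag_varint(len(raw)) + raw


def encode_array(items, encode_item) -> bytes:
    """An array is one block (count + items) followed by a zero count."""
    if not items:
        return zigzag_varint(0)
    body = b"".join(encode_item(item) for item in items)
    return zigzag_varint(len(items)) + body + zigzag_varint(0)


def encode_address(address: dict) -> bytes:
    # Record fields are written in schema order, with no names or separators.
    return encode_string(address["street"]) + zigzag_varint(address["street_number"])


def encode_user(user: dict) -> bytes:
    return (
        encode_string(user["name"])
        + zigzag_varint(user["age"])
        + encode_array(user["addresses"], encode_address)
    )


user_data = {
    "name": "john",
    "age": 20,
    "addresses": [{"street": "test", "street_number": 10}],
}
encoded = encode_user(user_data)
print(encoded)
```

Reading the result byte by byte: `\x08` is the length 4 of "john", `(` (0x28) is the age 20, `\x02` is an array block of one item, `\x08test` and `\x14` are the street and street number, and `\x00` closes the array.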
Deserialization
Deserialization can be performed with either a dataclass instance or the dataclass itself.
import typing

from dataclasses_avroschema import SchemaGenerator


class Address:
    "An Address"
    street: str
    street_number: int


class User:
    "User with multiple Address"
    name: str
    age: int
    addresses: typing.List[Address]


avro_binary = b"\x08john(\x02\x08test\x14\x00"
avro_json_binary = b'{"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}'

# pass the class itself here; an instance works as well
schema = SchemaGenerator(User)

schema.deserialize(avro_binary)
# >>> {"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}

schema.deserialize(avro_json_binary, serialization_type="avro-json")
# >>> {"name": "john", "age": 20, "addresses": [{"street": "test", "street_number": 10}]}
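As a companion to the example above, the following sketch decodes the same Avro binary payload by hand, following the Avro binary encoding rules (zig-zag variable-length integers, length-prefixed strings, block-counted arrays). It uses only the standard library and is not how the library itself deserializes data. The reader functions (`read_long`, `read_string`, `read_array`, `read_address`, `read_user`) are hypothetical names, and the field order is hard-coded from the `User` and `Address` classes.

```python
import io


def read_long(buf: io.BytesIO) -> int:
    """Read one zig-zag, variable-length integer."""
    shift = 0
    acc = 0
    while True:
        chunk = buf.read(1)
        if not chunk:
            raise EOFError("unexpected end of Avro data")
        byte = chunk[0]
        acc |= (byte & 0x7F) << shift
        if not byte & 0x80:  # continuation bit clear: last byte of this value
            break
        shift += 7
    return (acc >> 1) ^ -(acc & 1)  # undo zig-zag


def read_string(buf: io.BytesIO) -> str:
    length = read_long(buf)
    return buf.read(length).decode("utf-8")


def read_array(buf: io.BytesIO, read_item) -> list:
    """Read array blocks until a zero count marks the end."""
    items = []
    while True:
        count = read_long(buf)
        if count == 0:
            return items
        if count < 0:
            # A negative count is followed by the block size in bytes.
            read_long(buf)
            count = -count
        items.extend(read_item(buf) for _ in range(count))


def read_address(buf: io.BytesIO) -> dict:
    # Fields are read in the order the schema declares them.
    return {"street": read_string(buf), "street_number": read_long(buf)}


def read_user(buf: io.BytesIO) -> dict:
    return {
        "name": read_string(buf),
        "age": read_long(buf),
        "addresses": read_array(buf, read_address),
    }


avro_binary = b"\x08john(\x02\x08test\x14\x00"
decoded = read_user(io.BytesIO(avro_binary))
print(decoded)
```

Because Avro binary data carries no field names, the reader needs the schema (here, the hard-coded field order) to interpret the bytes.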
Features
- Primitive types: int, long, float, boolean, string and null support
- Complex types: enum, array, map, fixed, unions and records support
- Logical Types: date, time, datetime, uuid support
- Schema relations (oneToOne, oneToMany)
- Recursive Schemas
- Generate Avro Schemas from faust.Record
- Instance serialization corresponding to the generated avro schema
- Data deserialization
Development
- Create a virtualenv: python3.7 -m venv venv && source venv/bin/activate
- Install requirements: pip install -r requirements.txt
- Code linting: ./scripts/lint
- Run tests: ./scripts/test