AWS Glue Schema Registry for Python

Use the AWS Glue Schema Registry in Python projects.

This library is a partial port of the Java aws-glue-schema-registry library. It implements a subset of that library's features and is fully compatible with it.

Feature Support

Feature	Java Library	Python Library	Notes
Serialization and deserialization using schema registry	✔️	✔️
Avro message format	✔️	✔️
JSON Schema message format	✔️	✔️
Kafka Streams support	✔️		N/A for Python, Kafka Streams is Java-only
Compression	✔️	✔️
Local schema cache	✔️	✔️
Schema auto-registration	✔️	✔️
Evolution checks	✔️	✔️
Migration from a third party Schema Registry	✔️	✔️
Flink support	✔️	❌
Kafka Connect support	✔️		N/A for Python, Kafka Connect is Java-only

Installation

Clone this repository and install it:

pip install -e .

This library includes opt-in extra dependencies that enable support for certain features. For example, to use the schema registry with kafka-python, you should install the kafka-python extra:

pip install -e .[kafka-python]

Extra name	Purpose
kafka-python	Provides adapter classes to plug into `kafka-python`
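
To confirm that the extra installed correctly, one option is to check that the Kafka client can be found by the import system. This is a minimal sketch using only the standard library. The helper name `has_module` is made up for illustration and is not part of this library; `kafka` is the import name of the kafka-python distribution.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if has_module("kafka"):
        print("kafka-python is available")
    else:
        print("kafka-python is missing; install the kafka-python extra")
```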

Usage

First use boto3 to create a low-level AWS Glue client:

import boto3

# Pass your AWS credentials or profile information here
session = boto3.Session(aws_access_key_id='xxx', aws_secret_access_key='xxx', region_name='us-west-2')

glue_client = session.client('glue')

See https://boto3.amazonaws.com/v1/documentation/api/latest/guide/quickstart.html#configuration for more information on configuring boto3.

Send Kafka messages with SchemaRegistrySerializer:

from aws_schema_registry import DataAndSchema, SchemaRegistryClient
from aws_schema_registry.avro import AvroSchema

# In this example we will use kafka-python as our Kafka client,
# so we need to have the `kafka-python` extras installed and use
# the kafka adapter.
from aws_schema_registry.adapter.kafka import KafkaSerializer
from kafka import KafkaProducer

# Create the schema registry client, which is a façade around the boto3 glue client
client = SchemaRegistryClient(glue_client,
                              registry_name='my-registry')

# Create the serializer
serializer = KafkaSerializer(client)

# Create the producer
producer = KafkaProducer(value_serializer=serializer)

# Our producer needs a schema to send along with the data.
# In this example we're using Avro, so we'll load an .avsc file.
with open('user.avsc', 'r') as schema_file:
    schema = AvroSchema(schema_file.read())

# Send message data along with schema
data = {
    'name': 'John Doe',
    'favorite_number': 6
}
producer.send('my-topic', value=(data, schema))
# the value MUST be a tuple when we're using the KafkaSerializer
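
The producer example loads `user.avsc`, but the file's contents are not shown. The sketch below builds one possible schema for the message used above, with the two fields `name` and `favorite_number`. The record name `User`, the namespace, and the field types are assumptions for illustration and are not taken from the library.

```python
import json

# Hypothetical Avro record schema matching the example message.
# The record name, namespace, and field types are assumptions.
user_schema = {
    "type": "record",
    "name": "User",
    "namespace": "example.avro",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "favorite_number", "type": "int"},
    ],
}

# Avro schema files (.avsc) are plain JSON documents.
schema_text = json.dumps(user_schema, indent=2)
print(schema_text)
```

Saving the printed JSON as `user.avsc` next to the producer script gives the `open('user.avsc', 'r')` call a file to read.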

Read Kafka messages with SchemaRegistryDeserializer:

from aws_schema_registry import DataAndSchema, SchemaRegistryClient

# In this example we will use kafka-python as our Kafka client,
# so we need to have the `kafka-python` extras installed and use
# the kafka adapter.
from aws_schema_registry.adapter.kafka import KafkaDeserializer
from kafka import KafkaConsumer

# Create the schema registry client, which is a façade around the boto3 glue client
client = SchemaRegistryClient(glue_client,
                              registry_name='my-registry')

# Create the deserializer
deserializer = KafkaDeserializer(client)

# Create the consumer
consumer = KafkaConsumer('my-topic', value_deserializer=deserializer)

# Now use the consumer normally
for message in consumer:
    # The deserializer produces DataAndSchema instances
    value: DataAndSchema = message.value
    # which are NamedTuples with a `data` and `schema` property
    value.data == value[0]
    value.schema == value[1]
    # and can be deconstructed
    data, schema = value
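
The tuple behaviour described in the comments above can be tried without a Kafka broker. The sketch below defines a stand-in `NamedTuple` with the same two field names. It is not the library's `DataAndSchema` class, and the placeholder string used for the schema is purely illustrative.

```python
from typing import Any, NamedTuple


class DataAndSchemaStandIn(NamedTuple):
    """Hypothetical stand-in with the same field names as DataAndSchema."""
    data: Any
    schema: Any


value = DataAndSchemaStandIn(data={"name": "John Doe"}, schema="<schema>")

# Fields are reachable by name or by position.
assert value.data is value[0]
assert value.schema is value[1]

# Unpacking works like any other two-element tuple.
data, schema = value

# A NamedTuple is a tuple subclass, so it compares equal to a plain
# (data, schema) tuple holding the same items.
assert value == ({"name": "John Doe"}, "<schema>")
```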

Contributing

Clone this repository and install development dependencies:

pip install -e .[dev]

Run the linter and tests with tox before committing. After committing, check GitHub Actions to see the result of the automated checks.

Linting

Lint the code with:

flake8

Run the type checker with:

mypy

Tests

Tests go under the tests/ directory. All tests outside of tests/integration are unit tests with no external dependencies.

Tests under tests/integration are integration tests that interact with external resources and/or real AWS schema registries. They generally run slower and require some additional configuration.

Run just the unit tests with:

pytest --ignore tests/integration

All integration tests use the following environment variables:

AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_SESSION_TOKEN
AWS_REGION
AWS_PROFILE
CLEANUP_REGISTRY: Set to any value to prevent the test from destroying the registry created during the test, allowing you to inspect its contents.

If no AWS_ environment variables are set, boto3 will try to load credentials from your default AWS profile.
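
Before running the integration tests, it can help to see which of these variables are set. This is a minimal sketch: the helper name `summarize_test_env` is hypothetical and not part of the test suite, and treating `CLEANUP_REGISTRY` as active whenever it is present is an assumption based on the "set to any value" wording above.

```python
import os
from typing import Dict, Mapping

AWS_VARIABLES = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_REGION",
    "AWS_PROFILE",
)


def summarize_test_env(env: Mapping[str, str]) -> Dict[str, bool]:
    """Report which variables are present, without exposing their values."""
    summary = {name: name in env for name in AWS_VARIABLES}
    # Assumption: presence alone counts as "set to any value".
    summary["CLEANUP_REGISTRY"] = "CLEANUP_REGISTRY" in env
    return summary


if __name__ == "__main__":
    for name, is_set in summarize_test_env(os.environ).items():
        print(f"{name}: {'set' if is_set else 'not set'}")
```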

See individual integration test directories for additional requirements and setup instructions.

Tox

This project uses Tox to run tests across multiple Python versions.

Install Tox with:

pip install tox

and run it with:

tox

Note that Tox requires every tested Python version to be installed. One convenient way to manage this is with pyenv. See the .python-versions file for the Python versions that need to be installed.

Releases

Assuming you have PyPI upload permissions:

python -m build
twine upload -r testpypi dist/*
twine upload dist/*

Release history

Version	Release date
1.1.3	Oct 17, 2023
1.1.2	Feb 11, 2022
1.1.1	Dec 1, 2021
1.1.0	Nov 21, 2021
1.0.0	Oct 11, 2021
1.0.0rc6 (pre-release)	Oct 8, 2021
1.0.0rc5 (pre-release)	Oct 5, 2021
1.0.0rc4 (pre-release)	Oct 5, 2021
1.0.0rc3 (pre-release)	Oct 5, 2021
1.0.0rc2 (pre-release)	Oct 4, 2021
1.0.0rc1 (pre-release)	Oct 4, 2021