
Common functionality built on top of Apache Avro


# Data Pipeline Avro Util


What is it?
-----------
The Data Pipeline Avro utility package provides a Pythonic interface
for reading and writing Avro schemas. It also provides an enum class
for metadata that we've found useful to include in our schemas.


Download and Install
---------------------------
```
# Get the source
git clone git@github.com:Yelp/data_pipeline_avro_util.git
# Install the package with pip
pip install data_pipeline_avro_util
```


Tests
-----
Run the unit tests:
```
make test
```


Usage
-----
Using the Avro Schema Builder:
```
from data_pipeline_avro_util.avro_builder import AvroSchemaBuilder
from data_pipeline_avro_util.data_pipeline.avro_meta_data import AvroMetaDataKeys

avro_builder = AvroSchemaBuilder()

# Start a record schema.
avro_builder.begin_record(
    name="test_name",
    namespace="test_namespace",
    doc="test_doc"
)

# Add a string field and mark it as the first primary key.
avro_builder.add_field(
    name="key1",
    typ="string",
    doc="test_doc1",
    metadata={
        AvroMetaDataKeys.PRIMARY_KEY: 1
    }
)

# Add a second string field with no extra metadata.
avro_builder.add_field(
    name="key2",
    typ="string",
    doc="test_doc2"
)

# Finish the record and print the resulting schema.
record_json = avro_builder.end()
print(record_json)

# Output:
# {
#     "type": "record",
#     "namespace": "test_namespace",
#     "name": "test_name",
#     "doc": "test_doc",
#     "fields": [
#         {"type": "string", "doc": "test_doc1", "name": "key1", "pkey": True},
#         {"type": "string", "doc": "test_doc2", "name": "key2"}
#     ]
# }
```
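
The schema printed above is an ordinary Python dict, so it can be post-processed with the standard library. The following is a minimal sketch, not part of this package: the dict is hardcoded from the example output so the snippet runs without `data_pipeline_avro_util` installed, and the helpers `to_avsc` and `primary_key_fields` are hypothetical names introduced only for illustration. It assumes primary keys appear under the `"pkey"` entry, as in the output above.

```python
import json

# Schema dict copied from the example output above (hardcoded so this sketch
# runs without data_pipeline_avro_util installed).
record_json = {
    "type": "record",
    "namespace": "test_namespace",
    "name": "test_name",
    "doc": "test_doc",
    "fields": [
        {"type": "string", "doc": "test_doc1", "name": "key1", "pkey": True},
        {"type": "string", "doc": "test_doc2", "name": "key2"},
    ],
}


def to_avsc(schema):
    """Serialize a schema dict to a JSON string (hypothetical helper)."""
    return json.dumps(schema, indent=2, sort_keys=True)


def primary_key_fields(schema):
    """Return the names of fields with a truthy "pkey" entry (hypothetical helper)."""
    return [field["name"] for field in schema["fields"] if field.get("pkey")]


avsc = to_avsc(record_json)
print(avsc)
print(primary_key_fields(record_json))
```

Serializing with `json.dumps` turns the Python `True` into the JSON literal `true`, which is the form Avro tooling that reads schema files generally expects.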


Disclaimer
----------
We're still in the process of setting up this package as a stand-alone project. Additional work may be required to run the code and integrate it with other applications.


License
-------
Data Pipeline Avro Util is licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0


Contributing
------------
Everyone is encouraged to contribute to Data Pipeline Avro Util by forking the GitHub repository and making a pull request or opening an issue.



Documentation
-------------

The full documentation has not yet been published to a public server
(TODO DATAPIPE-2030|abrar: upload servicedocs to public server).



History
-------

0.1.0 (2015-01-29)
++++++++++++++++++

* First release.
