Convert a pydantic model to an AWS Glue schema for Terraform
JSON Schema to AWS Glue schema converter
Installation
pip install pydantic-glue
What?
Converts pydantic schemas to JSON Schema and then to an AWS Glue schema, so in theory anything that can be converted to JSON Schema could also work.
Why?
When using AWS Kinesis Firehose in a configuration that receives JSON records and writes parquet files to S3, one needs to define an AWS Glue table so Firehose knows what schema to use when creating the parquet files.
AWS Glue lets you define a schema using Avro or JSON Schema and then create a table from that schema, but as of May 2022 AWS has a limitation: tables created that way can't be used with Kinesis Firehose.
https://stackoverflow.com/questions/68125501/invalid-schema-error-in-aws-glue-created-via-terraform
This is also confirmed by AWS support.
What one could do is create a table and set the columns manually, but this means you now have two sources of truth to maintain.
This tool allows you to define a table in pydantic and generate a JSON file with column types that can be used with Terraform to create a Glue table.
Example
Take the following pydantic classes:
from pydantic import BaseModel
from typing import List


class Bar(BaseModel):
    name: str
    age: int


class Foo(BaseModel):
    nums: List[int]
    bars: List[Bar]
    other: str
Running pydantic-glue on that file:
pydantic-glue -f example.py -c Foo
prints this JSON to the terminal:
{
  "//": "Generated by pydantic-glue at 2022-05-25 12:35:55.333570. DO NOT EDIT",
  "columns": {
    "nums": "array<int>",
    "bars": "array<struct<name:string,age:int>>",
    "other": "string"
  }
}
This JSON can be used in Terraform like this:
locals {
  columns = jsondecode(file("${path.module}/glue_schema.json")).columns
}

resource "aws_glue_catalog_table" "table" {
  name          = "table_name"
  database_name = "db_name"

  storage_descriptor {
    dynamic "columns" {
      for_each = local.columns

      content {
        name = columns.key
        type = columns.value
      }
    }
  }
}
Alternatively, you can run the CLI with the -o flag to set the output file location:
pydantic-glue -f example.py -c Foo -o example.json -l
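Before pointing Terraform at the generated file, it can help to check that it has the shape the `jsondecode(...).columns` expression expects: a top-level "columns" object mapping column names to type strings. The following is a minimal sketch of such a check; `load_columns` is a hypothetical helper written for illustration and is not part of pydantic-glue.

```python
import json


def load_columns(text: str) -> dict[str, str]:
    """Parse generated output and return the column map, or raise ValueError.

    Hypothetical helper: it only checks the shape shown in the example
    output (a "columns" object of name -> type string).
    """
    doc = json.loads(text)
    columns = doc.get("columns") if isinstance(doc, dict) else None
    if not isinstance(columns, dict) or not columns:
        raise ValueError("expected a non-empty 'columns' object")
    for name, glue_type in columns.items():
        if not isinstance(glue_type, str) or not glue_type:
            raise ValueError(f"column {name!r} has no type string")
    return columns


# Inline sample standing in for the contents of the generated file.
sample = (
    '{"//": "Generated by pydantic-glue", '
    '"columns": {"nums": "array<int>", "other": "string"}}'
)
checked = load_columns(sample)
```

In practice the text would come from the file written with -o, for example via `pathlib.Path("example.json").read_text()`.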
How does it work?
- pydantic gets converted to JSON Schema
- the JSON Schema types get mapped to Glue types recursively
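The recursive mapping step can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool's actual implementation: the function names, the scalar type table (in particular "number" to "double"), and the hand-written `FOO_SCHEMA` dict, which approximates the JSON Schema pydantic would produce for `Foo`, are all assumptions.

```python
from typing import Any

# Assumed scalar mapping; the real tool's table may differ.
SCALAR_TYPES = {
    "string": "string",
    "integer": "int",
    "number": "double",
    "boolean": "boolean",
}


def to_glue_type(node: dict[str, Any], root: dict[str, Any]) -> str:
    """Recursively map one JSON Schema node to a Glue type string."""
    if "$ref" in node:
        # Follow local references such as "#/$defs/Bar".
        target: Any = root
        for part in node["$ref"].lstrip("#/").split("/"):
            target = target[part]
        return to_glue_type(target, root)
    kind = node.get("type")
    if kind == "array":
        return f"array<{to_glue_type(node['items'], root)}>"
    if kind == "object":
        fields = ",".join(
            f"{name}:{to_glue_type(sub, root)}"
            for name, sub in node.get("properties", {}).items()
        )
        return f"struct<{fields}>"
    if isinstance(kind, str) and kind in SCALAR_TYPES:
        return SCALAR_TYPES[kind]
    raise NotImplementedError(f"unsupported JSON Schema node: {node}")


def to_columns(schema: dict[str, Any]) -> dict[str, str]:
    """Map each top-level property of the schema to a Glue column type."""
    return {
        name: to_glue_type(sub, schema)
        for name, sub in schema["properties"].items()
    }


# Hand-written approximation of the JSON Schema for the Foo model.
FOO_SCHEMA = {
    "type": "object",
    "$defs": {
        "Bar": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
        }
    },
    "properties": {
        "nums": {"type": "array", "items": {"type": "integer"}},
        "bars": {"type": "array", "items": {"$ref": "#/$defs/Bar"}},
        "other": {"type": "string"},
    },
}

columns = to_columns(FOO_SCHEMA)
```

For this schema, `columns` matches the "columns" object in the example output. Unsupported nodes raise `NotImplementedError` rather than guessing a type, which keeps gaps in the mapping visible.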
Future work
- Not all types are supported. I add types as I need them, but adding types is very easy; feel free to open an issue or send a PR if you stumble upon an unsupported use case.
- The tool could easily be extended to work with JSON Schema directly.
- Thus, anything that can be converted to JSON Schema should also work.