Skip to main content

Library to convert Google BigQuery Table Schema into Json Schema

Project description

Google BigQuery Table Schema Converter

license pyversion coverage test downloads

Python library to convert Google BigQuery table schema into draft-07 json schema and vice versa.

The library includes two main modules:

gbqschema_converter
├── gbqschema_to_jsonschema.py
└── jsonschema_to_gbqschema.py

Each of those modules has two main functions:

  • json_representation: corresponds to json output (input for gbqschema_to_jsonschema).
  • sdk_representation: corresponds to Google Python SDK format output (input for gbqschema_to_jsonschema).

Installation

python3 -m venv env && source ${PWD}/env/bin/activate
(env) pip install --no-cache-dir gbqschema_converter

Usage: CLI

Convert json-schema to GBQ table schema

(env) json2gbq -h
usage: json2gbq [-h] (-i INPUT | -f FILE)

Google BigQuery Table Schema Converter

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input object as string.
  -f FILE, --file FILE  Input object as file path.

Example: stdin

Execution:

(env) json2gbq -i '{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/element"
  },
  "definitions": {
    "element": {
      "type": "object",
      "properties": {
        "att_01": {
          "type": "integer",
          "description": "Att 1"
        },
        "att_02": {
          "type": "number",
          "description": "Att 2"
        },
        "att_03": {
          "type": "string"
        },
        "att_04": {
          "type": "boolean"
        },
        "att_05": {
          "type": "string",
          "format": "date"
        },
        "att_06": {
          "type": "string",
          "format": "date-time"
        },
        "att_07": {
          "type": "string",
          "format": "time"
        }
      },
      "required": [
        "att_02"
      ]
    }
  }
}'

Output:

2020-04-08 21:42:51.700 [INFO ] [Google BigQuery Table Schema Converter] Output (5.52 ms elapsed):
[
  {
    "description": "Att 1",
    "name": "att_01",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "description": "Att 2",
    "name": "att_02",
    "type": "NUMERIC",
    "mode": "REQUIRED"
  },
  {
    "name": "att_03",
    "type": "STRING",
    "mode": "NULLABLE"
  },
  {
    "name": "att_04",
    "type": "BOOLEAN",
    "mode": "NULLABLE"
  },
  {
    "name": "att_05",
    "type": "DATE",
    "mode": "NULLABLE"
  },
  {
    "name": "att_06",
    "type": "TIMESTAMP",
    "mode": "NULLABLE"
  },
  {
    "name": "att_07",
    "type": "STRING",
    "mode": "NULLABLE"
  }
]

Example: file

Execution:

(env) json2gbq -f ${PWD}/data/jsonschema.json

Output:

2020-04-08 21:57:25.516 [INFO ] [Google BigQuery Table Schema Converter] Output (6.39 ms elapsed):
[
  {
    "description": "Att 1",
    "name": "att_01",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "description": "Att 2",
    "name": "att_02",
    "type": "NUMERIC",
    "mode": "REQUIRED"
  },
  {
    "name": "att_03",
    "type": "STRING",
    "mode": "NULLABLE"
  },
  {
    "name": "att_04",
    "type": "BOOLEAN",
    "mode": "NULLABLE"
  },
  {
    "name": "att_05",
    "type": "DATE",
    "mode": "NULLABLE"
  },
  {
    "name": "att_06",
    "type": "TIMESTAMP",
    "mode": "NULLABLE"
  },
  {
    "name": "att_07",
    "type": "STRING",
    "mode": "NULLABLE"
  }
]

Convert GBQ table schema to json-schema

(env) gbq2json -h
usage: gbq2json [-h] (-i INPUT | -f FILE)

Google BigQuery Table Schema Converter

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Input object as string.
  -f FILE, --file FILE  Input object as file path.

Example: stdin

Execution:

(env) gbq2json -i '[
  {
    "description": "Att 1",
    "name": "att_01",
    "type": "INTEGER",
    "mode": "NULLABLE"
  },
  {
    "description": "Att 2",
    "name": "att_02",
    "type": "NUMERIC",
    "mode": "REQUIRED"
  },
  {
    "name": "att_03",
    "type": "STRING",
    "mode": "NULLABLE"
  },
  {
    "name": "att_04",
    "type": "BOOLEAN",
    "mode": "NULLABLE"
  },
  {
    "name": "att_05",
    "type": "DATE",
    "mode": "NULLABLE"
  },
  {
    "name": "att_06",
    "type": "DATETIME",
    "mode": "NULLABLE"
  },
  {
    "name": "att_07",
    "type": "TIMESTAMP",
    "mode": "NULLABLE"
  }
]'

Output:

2020-04-08 21:51:05.370 [INFO ] [Google BigQuery Table Schema Converter] Output (1.08 ms elapsed):
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/element"
  },
  "definitions": {
    "element": {
      "type": "object",
      "properties": {
        "att_01": {
          "type": "integer",
          "description": "Att 1"
        },
        "att_02": {
          "type": "number",
          "description": "Att 2"
        },
        "att_03": {
          "type": "string"
        },
        "att_04": {
          "type": "boolean"
        },
        "att_05": {
          "type": "string",
          "format": "date"
        },
        "att_06": {
          "type": "string",
          "pattern": "^[0-9]{4}-((|0)[1-9]|1[0-2])-((|[0-2])[1-9]|3[0-1])(|T)((|[0-1])[0-9]|2[0-3]):((|[0-5])[0-9]):((|[0-5])[0-9])(|.[0-9]{1,6})$"
        },
        "att_07": {
          "type": "string",
          "format": "date-time"
        }
      },
      "additionalProperties": false,
      "required": [
        "att_02"
      ]
    }
  }
}

Example: file

Execution:

(env) gbq2json -f ${PWD}/data/gbqschema.json

Output:

2020-04-08 21:55:20.275 [INFO ] [Google BigQuery Table Schema Converter] Output (1.72 ms elapsed):
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/element"
  },
  "definitions": {
    "element": {
      "type": "object",
      "properties": {
        "att_01": {
          "type": "integer",
          "description": "Att 1"
        },
        "att_02": {
          "type": "number",
          "description": "Att 2"
        },
        "att_03": {
          "type": "string"
        },
        "att_04": {
          "type": "boolean"
        },
        "att_05": {
          "type": "string",
          "format": "date"
        },
        "att_06": {
          "type": "string",
          "pattern": "^[0-9]{4}-((|0)[1-9]|1[0-2])-((|[0-2])[1-9]|3[0-1])(|T)((|[0-1])[0-9]|2[0-3]):((|[0-5])[0-9]):((|[0-5])[0-9])(|.[0-9]{1,6})$"
        },
        "att_07": {
          "type": "string",
          "format": "date-time"
        }
      },
      "additionalProperties": false,
      "required": [
        "att_02"
      ]
    }
  }
}

Usage: python program

Convert json-schema to GBQ table schema

Example: output as json

from gbqschema_converter.jsonschema_to_gbqschema import json_representation as converter

schema_in = {
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/element",
  },
  "definitions": {
    "element": {
      "type": "object",
      "properties": {
        "att_01": {
          "type": "integer",
          "description": "Att 1"
        },
        "att_02": {
          "type": "number",
        },
      }
      "required": [
        "att_02",
      ],
    },
  },
}

schema_out = converter(schema_in)
print(schema_out)

Output:

[{'description': 'Att 1', 'name': 'att_01', 'type': 'INTEGER', 'mode': 'NULLABLE'}, {'name': 'att_02', 'type': 'NUMERIC', 'mode': 'REQUIRED'}]

Example: output as list of SchemaField (SDK format)

from gbqschema_converter.jsonschema_to_gbqschema import sdk_representation as converter

schema_in = {
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "$ref": "#/definitions/element",
  },
  "definitions": {
    "element": {
      "type": "object",
      "properties": {
        "att_01": {
          "type": "integer",
          "description": "Att 1"
        },
        "att_02": {
          "type": "number",
        },
      },
      "required": [
        "att_02",
      ],
    },
  },
}

schema_out = converter(schema_in)
print(schema_out)

Output:

[SchemaField('att_01', 'INTEGER', 'NULLABLE', 'Att 1', ()), SchemaField('att_02', 'NUMERIC', 'REQUIRED', None, ())]

Convert GBQ table schema to json-schema

Example: output as json

from gbqschema_converter.gbqschema_to_jsonschema import json_representation as converter

schema_in = [
    {
        'description': 'Att 1',
        'name': 'att_01',
        'type': 'INTEGER',
        'mode': 'NULLABLE'
    },
    {
        'name': 'att_02',
        'type': 'NUMERIC',
        'mode': 'REQUIRED'
    }
]

schema_out = converter(schema_in)
print(schema_out)

Output:

{'$schema': 'http://json-schema.org/draft-07/schema#', 'type': 'array', 'items': {'$ref': '#/definitions/element'}, 'definitions': {'element': {'type': 'object', 'properties': {'att_01': {
'type': 'integer', 'description': 'Att 1'}, 'att_02': {'type': 'number'}}, 'additionalProperties': False, 'required': ['att_02']}}}

Example: output as list of SchemaField (SDK format)

from gbqschema_converter.gbqschema_to_jsonschema import sdk_representation as converter
from google.cloud.bigquery import SchemaField

schema_in = [
    SchemaField('att_01', 'INTEGER', 'NULLABLE', 'Att 1', ()),
    SchemaField('att_02', 'NUMERIC', 'REQUIRED', None, ()),
]

schema_out = converter(schema_in)
print(schema_out)

Output:

{'$schema': 'http://json-schema.org/draft-07/schema#', 'type': 'array', 'items': {'$ref': '#/definitions/element'}, 'definitions': {'element': {'type': 'object', 'properties': {'att_01': {
'type': 'integer', 'description': 'Att 1'}, 'att_02': {'type': 'number'}}, 'additionalProperties': False, 'required': ['att_02']}}}

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

gbqschema_converter-1.2.1.tar.gz (7.7 kB view hashes)

Uploaded Source

Built Distribution

gbqschema_converter-1.2.1-py3-none-any.whl (9.1 kB view hashes)

Uploaded Python 3

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page