Open Data Discovery Resource Name Generator
Requirements
- Python >= 3.7
Installation
poetry add oddrn-generator
# or
pip install oddrn-generator
Usage and configuration
Available generators
- cassandra - CassandraGenerator
- postgresql - PostgresqlGenerator
- mysql - MysqlGenerator
- glue - GlueGenerator
- s3 - S3Generator
- kafka - KafkaGenerator
- kafkaconnect - KafkaConnectGenerator
- snowflake - SnowflakeGenerator
- airflow - AirflowGenerator
- hive - HiveGenerator
- dynamodb - DynamodbGenerator
- odbc - OdbcGenerator
- mssql - MssqlGenerator
- oracle - OracleGenerator
- redshift - RedshiftGenerator
- clickhouse - ClickHouseGenerator
- athena - AthenaGenerator
- quicksight - QuicksightGenerator
- dbt - DbtGenerator
- prefect - PrefectGenerator
- tableau - TableauGenerator
- neo4j - Neo4jGenerator
Work in progress generators
- kubeflow - KubeflowGenerator
- dvc - DVCGenerator
- great_expectations - GreatExpectationsGenerator
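The keys listed above pair with their class names. A minimal sketch of looking a class up by key, assuming each class is exported from the top-level `oddrn_generator` module the way `PostgresqlGenerator` and `GlueGenerator` are in the usage examples. The `GENERATOR_CLASS_NAMES` dict and the `load_generator_class` helper are hypothetical and not part of the package; only a few entries are shown.

```python
import importlib

# Key -> class name, copied from the list of available generators.
GENERATOR_CLASS_NAMES = {
    "postgresql": "PostgresqlGenerator",
    "mysql": "MysqlGenerator",
    "glue": "GlueGenerator",
    "s3": "S3Generator",
    "kafka": "KafkaGenerator",
    "snowflake": "SnowflakeGenerator",
}


def load_generator_class(key):
    """Return the generator class for `key` (hypothetical helper).

    Assumes the class is exported from the top-level `oddrn_generator`
    module. The import is done lazily, so an unknown key fails before
    the package is touched.
    """
    try:
        class_name = GENERATOR_CLASS_NAMES[key]
    except KeyError:
        raise ValueError(f"Unknown generator key: {key!r}") from None
    module = importlib.import_module("oddrn_generator")
    return getattr(module, class_name)
```

With the package installed, `load_generator_class("glue")` would be expected to return the same class as `from oddrn_generator import GlueGenerator`.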
Generator properties
- base_oddrn - Get base oddrn (without path)
- available_paths - Get all available paths of the generator
Generator methods
- get_oddrn_by_path(path_name, new_value=None) - Get the oddrn string for a path. You can also set a value for this path with the 'new_value' param
- set_oddrn_paths(**kwargs) - Set or update the values of oddrn paths
- get_data_source_oddrn() - Get data source oddrn
Generator parameters:
- host_settings: str - optional. Hostname configuration
- cloud_settings: dict - optional. Cloud configuration
- **kwargs - path names and their values
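To make the properties and methods above concrete, here is a toy model of a path-based name generator. It is a sketch for illustration only, not the library's implementation: `ToyGenerator` and its `source`, `host`, and `path_order` arguments are hypothetical, it supports only a single linear chain of paths, and it assumes that `new_value` is stored before the name is built.

```python
class ToyGenerator:
    """Toy model of a path-based name generator (not the library's code)."""

    def __init__(self, source, host, path_order, **paths):
        self.source = source
        self.host = host
        # Ordered path names, e.g. ("schemas", "databases", "tables").
        self.path_order = tuple(path_order)
        self.values = {}
        self.set_oddrn_paths(**paths)

    @property
    def base_oddrn(self):
        # Base name: source type plus host, without any path segments.
        return f"//{self.source}/host/{self.host}"

    @property
    def available_paths(self):
        return self.path_order

    def set_oddrn_paths(self, **paths):
        # Validate every name before storing anything.
        for name in paths:
            if name not in self.path_order:
                raise KeyError(name)
        self.values.update(paths)

    def get_oddrn_by_path(self, path_name, new_value=None):
        # Look up the position first so an unknown name changes nothing.
        end = self.path_order.index(path_name) + 1
        if new_value is not None:
            self.values[path_name] = new_value
        # Join every path from the start of the chain up to `path_name`.
        segments = [f"{name}/{self.values[name]}" for name in self.path_order[:end]]
        return "/".join([self.base_oddrn] + segments)


gen = ToyGenerator(
    "postgresql", "my.host.com:5432",
    ("schemas", "databases", "tables", "columns"),
    schemas="schema_name", databases="database_name", tables="table_name",
)
print(gen.get_oddrn_by_path("tables"))
```

The toy raises plain `ValueError` and `KeyError` for unknown or unset paths; the library reports such cases through its own exception classes.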
Example usage
# postgresql
from oddrn_generator import PostgresqlGenerator
oddrn_gen = PostgresqlGenerator(
    host_settings='my.host.com:5432',
    schemas='schema_name', databases='database_name', tables='table_name'
)
oddrn_gen.base_oddrn
# //postgresql/host/my.host.com:5432
oddrn_gen.available_paths
# ('schemas', 'databases', 'tables', 'columns')
oddrn_gen.get_data_source_oddrn()
# //postgresql/host/my.host.com:5432/schemas/schema_name/databases/database_name
oddrn_gen.get_oddrn_by_path("schemas")
# //postgresql/host/my.host.com:5432/schemas/schema_name
oddrn_gen.get_oddrn_by_path("databases")
# //postgresql/host/my.host.com:5432/schemas/schema_name/databases/database_name
oddrn_gen.get_oddrn_by_path("tables")
# //postgresql/host/my.host.com:5432/schemas/schema_name/databases/database_name/tables/table_name
# you can set or change paths:
oddrn_gen.set_oddrn_paths(tables='another_table_name', columns='new_column_name')
oddrn_gen.get_oddrn_by_path("columns")
# //postgresql/host/my.host.com:5432/schemas/schema_name/databases/database_name/tables/another_table_name/columns/new_column_name
# you can get a path with a new value:
oddrn_gen.get_oddrn_by_path("columns", new_value="another_new_column_name")
# //postgresql/host/my.host.com:5432/schemas/schema_name/databases/database_name/tables/another_table_name/columns/another_new_column_name
# glue
from oddrn_generator import GlueGenerator
oddrn_gen = GlueGenerator(
    cloud_settings={'account': 'acc_id', 'region': 'reg_id'},
    databases='database_name', tables='table_name', columns='column_name',
    jobs='job_name', runs='run_name', owners='owner_name'
)
oddrn_gen.available_paths
# ('databases', 'tables', 'columns', 'owners', 'jobs', 'runs')
oddrn_gen.get_oddrn_by_path("databases")
# //glue/cloud/aws/account/acc_id/region/reg_id/databases/database_name
oddrn_gen.get_oddrn_by_path("tables")
# //glue/cloud/aws/account/acc_id/region/reg_id/databases/database_name/tables/table_name
oddrn_gen.get_oddrn_by_path("columns")
# //glue/cloud/aws/account/acc_id/region/reg_id/databases/database_name/tables/table_name/columns/column_name
oddrn_gen.get_oddrn_by_path("jobs")
# //glue/cloud/aws/account/acc_id/region/reg_id/jobs/job_name
oddrn_gen.get_oddrn_by_path("runs")
# //glue/cloud/aws/account/acc_id/region/reg_id/jobs/job_name/runs/run_name
oddrn_gen.get_oddrn_by_path("owners")
# //glue/cloud/aws/account/acc_id/region/reg_id/owners/owner_name
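In the outputs above, every name starts with `//`, then the source type, then alternating key/value segments. A small sketch of reading a name back into its parts under that assumption; `split_oddrn` is a hypothetical helper, not part of the package, and it would break on values that themselves contain `/`.

```python
def split_oddrn(oddrn):
    """Split a name into (source, [(key, value), ...]).

    Assumes the layout seen in the examples: '//' + source type,
    followed by alternating key/value segments separated by '/'.
    """
    if not oddrn.startswith("//"):
        raise ValueError("expected a name starting with '//'")
    source, *rest = oddrn[2:].split("/")
    if len(rest) % 2:
        raise ValueError("expected key/value pairs after the source type")
    # Pair up alternating segments: key, value, key, value, ...
    pairs = list(zip(rest[0::2], rest[1::2]))
    return source, pairs


source, pairs = split_oddrn(
    "//glue/cloud/aws/account/acc_id/region/reg_id/jobs/job_name/runs/run_name"
)
print(source)       # glue
print(dict(pairs))  # {'cloud': 'aws', 'account': 'acc_id', 'region': 'reg_id', 'jobs': 'job_name', 'runs': 'run_name'}
```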
Exceptions
- WrongPathOrderException - raised when trying to set a path whose parent path is not set
from oddrn_generator import PostgresqlGenerator
oddrn_gen = PostgresqlGenerator(
    host_settings='my.host.com:5432',
    schemas='schema_name', databases='database_name',
    columns='column_without_table'
)
# WrongPathOrderException: 'columns' can not be without 'tables' attribute
- EmptyPathValueException - raised when trying to get a path that is not set up
from oddrn_generator import PostgresqlGenerator
oddrn_gen = PostgresqlGenerator(
    host_settings='my.host.com:5432', schemas='schema_name', databases='database_name',
)
oddrn_gen.get_oddrn_by_path("tables")
# EmptyPathValueException: Path 'tables' is not set up
- PathDoestExistException - raised when trying to get an oddrn path that does not exist in the generator
from oddrn_generator import PostgresqlGenerator
oddrn_gen = PostgresqlGenerator(
    host_settings='my.host.com:5432', schemas='schema_name', databases='database_name',
)
oddrn_gen.get_oddrn_by_path("jobs")
# PathDoestExistException: Path 'jobs' doesn't exist in generator
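The three failure cases above can be read as three separate checks. The sketch below models them with stand-in exception classes and a hypothetical `PARENTS` map for the PostgreSQL path chain. It does not use the library's own classes or logic, and only mirrors the messages shown in the examples.

```python
# Stand-in exception classes for this sketch only.
class WrongPathOrder(Exception):
    pass


class EmptyPathValue(Exception):
    pass


class PathDoesNotExist(Exception):
    pass


# Each path names the path it depends on (None means no dependency).
PARENTS = {
    "schemas": None,
    "databases": "schemas",
    "tables": "databases",
    "columns": "tables",
}


def check_set(values, name):
    """Reject setting `name` while the path it depends on is unset."""
    parent = PARENTS[name]
    if parent is not None and parent not in values:
        raise WrongPathOrder(f"'{name}' can not be without '{parent}' attribute")


def check_get(values, name):
    """Reject reading a path that is unknown or has no value yet."""
    if name not in PARENTS:
        raise PathDoesNotExist(f"Path '{name}' doesn't exist in generator")
    if name not in values:
        raise EmptyPathValue(f"Path '{name}' is not set up")


values = {"schemas": "schema_name", "databases": "database_name"}
for check, name in ((check_set, "columns"), (check_get, "tables"), (check_get, "jobs")):
    try:
        check(values, name)
    except Exception as exc:
        print(type(exc).__name__, exc)
```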
Development
# Install dependencies
poetry install
# Activate shell
poetry shell
# Run tests
poetry run pytest