Ingestion Framework for OpenMetadata
This guide will help you set up the Ingestion framework and its connectors.
Setup Ingestion
Ingestion is a data ingestion library inspired by Apache Gobblin. It can be used in an orchestration framework (e.g. Apache Airflow) to build data for OpenMetadata.
Prerequisites
- Python >= 3.8.x
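The prerequisite above can be checked before installing anything. The snippet below is a minimal sketch using only the standard library; the helper name `python_is_supported` is hypothetical and not part of the ingestion package.

```python
import sys

# Minimum version taken from the prerequisite above.
MIN_PYTHON = (3, 8)


def python_is_supported(version_info=None, minimum=MIN_PYTHON):
    """Return True when the (major, minor) version meets the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if python_is_supported():
        print("Python version is new enough for the ingestion framework.")
    else:
        print("Python 3.8 or newer is required.")
```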
Install From PyPI
python3 -m pip install --upgrade pip wheel setuptools
python3 -m pip install --upgrade openmetadata-ingestion
python3 -m spacy download en_core_web_sm
Install Ingestion Connector Dependencies
Sources:
Plugin Name | Install Command | Provides |
---|---|---|
athena | pip install 'openmetadata-ingestion[athena]' | AWS Athena |
bigquery | pip install 'openmetadata-ingestion[bigquery]' | BigQuery |
bigquery-usage | pip install 'openmetadata-ingestion[bigquery-usage]' | BigQuery usage |
hive | pip install 'openmetadata-ingestion[hive]' | Hive |
ldap-users | pip install 'openmetadata-ingestion[ldap-users]' | LDAP |
mssql | pip install 'openmetadata-ingestion[mssql]' | SQL Server |
mssql-odbc | pip install 'openmetadata-ingestion[mssql-odbc]' | SQL Server ODBC |
mysql | pip install 'openmetadata-ingestion[mysql]' | MySQL |
oracle | pip install 'openmetadata-ingestion[oracle]' | Oracle |
postgres | pip install 'openmetadata-ingestion[postgres]' | Postgres |
redshift | pip install 'openmetadata-ingestion[redshift]' | Redshift |
redshift-usage | pip install 'openmetadata-ingestion[redshift-usage]' | Redshift usage |
snowflake | pip install 'openmetadata-ingestion[snowflake]' | Snowflake |
snowflake-usage | pip install 'openmetadata-ingestion[snowflake-usage]' | Snowflake usage |
elasticsearch | pip install 'openmetadata-ingestion[elasticsearch]' | Elasticsearch |
sample-tables | pip install 'openmetadata-ingestion[sample-tables]' | Sample Tables |
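When several connectors are needed, pip accepts a comma-separated list of extras in a single requirement, so the plugins in the table can be installed in one command. The sketch below builds that command string; the helper name `pip_install_command` is hypothetical, and the plugin names passed to it are assumed to come from the table above.

```python
import shlex

PACKAGE = "openmetadata-ingestion"


def pip_install_command(plugins):
    """Build one pip command that installs the package with the given extras."""
    # Deduplicate and sort so the generated command is stable.
    extras = ",".join(sorted(set(plugins)))
    requirement = f"{PACKAGE}[{extras}]" if extras else PACKAGE
    # shlex.quote adds quotes only when the shell would need them
    # (square brackets are glob characters in most shells).
    return "python3 -m pip install " + shlex.quote(requirement)


if __name__ == "__main__":
    print(pip_install_command(["redshift", "redshift-usage", "mysql"]))
```

The printed command can be pasted into a shell as-is.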
Generate Redshift Data
metadata ingest -c ./pipelines/redshift.json
Generate Redshift Usage Data
metadata ingest -c ./pipelines/redshift_usage.json
Generate Sample Tables
metadata ingest -c ./pipelines/sample_tables.json
Generate Sample Users
metadata ingest -c ./pipelines/sample_users.json
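The `metadata ingest -c <config>` commands above can also be driven from Python, for example from a task in an orchestration framework. This is a minimal sketch that shells out to the same CLI shown in this guide; the function names `ingest_command` and `run_pipelines` are hypothetical, and it assumes the `metadata` executable is on `PATH` and the config files exist.

```python
import subprocess


def ingest_command(config_path):
    """Return the argv list for `metadata ingest -c <config_path>`."""
    return ["metadata", "ingest", "-c", str(config_path)]


def run_pipelines(config_paths, runner=subprocess.run):
    """Run each pipeline in order, stopping at the first failure.

    `runner` defaults to subprocess.run; with check=True it raises
    CalledProcessError when a command exits with a non-zero status.
    """
    for path in config_paths:
        runner(ingest_command(path), check=True)


if __name__ == "__main__":
    run_pipelines([
        "./pipelines/sample_tables.json",
        "./pipelines/sample_users.json",
    ])
```

Passing the runner as a parameter keeps the sketch easy to dry-run: a stub can record the commands instead of executing them.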
Ingest MySQL data to Metadata APIs
metadata ingest -c ./pipelines/mysql.json
Ingest BigQuery data to Metadata APIs
export GOOGLE_APPLICATION_CREDENTIALS="$PWD/pipelines/creds/bigquery-cred.json"
metadata ingest -c ./pipelines/bigquery.json
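The BigQuery run depends on `GOOGLE_APPLICATION_CREDENTIALS` pointing at a readable credentials file, so it can help to verify that before starting the pipeline. The check below is a small sketch using only the standard library; the helper name `bigquery_credentials_ready` is hypothetical, and it only confirms that the file exists, not that the credentials are valid.

```python
import os
from pathlib import Path


def bigquery_credentials_ready(environ=None):
    """Return True when GOOGLE_APPLICATION_CREDENTIALS names an existing file."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS", "")
    return bool(path) and Path(path).is_file()


if __name__ == "__main__":
    if bigquery_credentials_ready():
        print("Credentials file found; the BigQuery pipeline can be started.")
    else:
        print("Set GOOGLE_APPLICATION_CREDENTIALS to an existing JSON key file.")
```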
Index Metadata into Elasticsearch
Run Elasticsearch in Docker
docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.10.2
Run ingestion connector
metadata ingest -c ./pipelines/metadata_to_es.json
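The indexing pipeline needs the Elasticsearch container to be accepting connections, which can take a few seconds after `docker run`. The sketch below probes the HTTP port published by the command above. The helper name `elasticsearch_is_up` is hypothetical, and the default URL assumes the container runs on the local machine with port 9200 mapped as shown.

```python
import urllib.error
import urllib.request


def elasticsearch_is_up(url="http://localhost:9200", timeout=2.0):
    """Return True when an HTTP GET on the given URL answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    if elasticsearch_is_up():
        print("Elasticsearch is reachable; run the metadata_to_es pipeline.")
    else:
        print("Elasticsearch is not reachable on http://localhost:9200 yet.")
```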