Hera2 Python SDK for metadata ingestion and management
A modern, fluent Python SDK for OpenMetadata that provides an intuitive API for all operations. Authentication is routed through the Heimdall authorization service for DataOS integration.
Replaces openmetadata-ingestion for SDK use: installing hera2-sdk pulls in openmetadata-ingestion as a dependency and shares the metadata namespace, so you get both metadata.sdk (this package) and the full ingestion stack: metadata.ingestion, metadata.generated, metadata.clients, metadata.profiler, metadata.utils, metadata.workflow, etc. A single venv with hera2-sdk gives you the same top-level metadata/ surface as openmetadata-ingestion alone, plus metadata.sdk.
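The shared metadata namespace relies on standard Python namespace packages (PEP 420): two distributions each contribute subpackages under one top-level name. A toy, self-contained sketch of that mechanism (the `ns`, `dist_a`, and `dist_b` names are made up for illustration, not part of either package):

```python
import os
import sys
import tempfile

# Two separate "distributions", each shipping part of a shared `ns` namespace,
# the way hera2-sdk and openmetadata-ingestion both contribute to `metadata`.
root = tempfile.mkdtemp()
for dist, mod in [("dist_a", "sdk"), ("dist_b", "ingestion")]:
    pkg_dir = os.path.join(root, dist, "ns")
    os.makedirs(pkg_dir)
    # No __init__.py: `ns` becomes a PEP 420 namespace package.
    with open(os.path.join(pkg_dir, f"{mod}.py"), "w") as f:
        f.write(f"NAME = '{mod}'\n")
    sys.path.insert(0, os.path.join(root, dist))

import ns.sdk        # provided by the first "distribution"
import ns.ingestion  # provided by the second one

print(ns.sdk.NAME, ns.ingestion.NAME)  # both resolve through one namespace
```

Because neither directory contains an `__init__.py`, Python merges both paths into one `ns` package, exactly how `metadata.sdk` and `metadata.ingestion` can live in different distributions yet import side by side.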
Installation
pip install hera2-sdk
If you see an error like No matching distribution found for openmetadata-ingestion==1.11.8.x (that exact patch version is not on PyPI), install from the repo:
pip install /path/to/hera2/hera2-sdk
Or from the project root: pip install -e hera2-sdk for an editable install.
Data Quality SDK Installation
For running data quality tests, additional dependencies may be required:
DataFrame Validation:
pip install 'hera2-sdk[pandas]'
Table-Based Testing:
pip install 'hera2-sdk[mysql]' # For MySQL
pip install 'hera2-sdk[postgres]' # For PostgreSQL
pip install 'hera2-sdk[snowflake]' # For Snowflake
pip install 'hera2-sdk[clickhouse]' # For ClickHouse
Quick Start
Configure the SDK (Heimdall Auth — Recommended)
Use the heimdall_configuration parameter (same structure as the authenticationConfiguration.heimdallConfiguration block in hera/config/config.yaml):
from metadata.sdk import configure
configure(
host="http://localhost:8585/api",
api_key="your-dataos-api-key",
heimdall_configuration={
"enabled": True,
"baseUrl": "https://your-instance.dataos.cloud/heimdall",
"timeout": 10,
"fallbackOnBasic": True,
},
)
Or use the legacy heimdall_url:
configure(
host="http://localhost:8585/api",
api_key="your-dataos-api-key",
heimdall_url="https://your-instance.dataos.cloud/heimdall",
)
Or set environment variables and call configure() with no arguments:
export OPENMETADATA_HOST="http://localhost:8585/api"
export OPENMETADATA_API_KEY="your-dataos-api-key"
export HEIMDALL_BASE_URL="https://your-instance.dataos.cloud/heimdall"
from metadata.sdk import configure
configure()
Configure Parameters
The configure() function supports:
- host or server_url: OpenMetadata server URL
- api_key or jwt_token: DataOS API key or JWT token
- heimdall_configuration: dict matching hera/config/config.yaml heimdallConfiguration (enabled, baseUrl, timeout, fallbackOnBasic, trustAll)
- heimdall_url: Heimdall base URL (legacy; use heimdall_configuration when possible)
- Falls back to environment variables:
- OPENMETADATA_HOST or OPENMETADATA_SERVER_URL: server URL
- OPENMETADATA_API_KEY or OPENMETADATA_JWT_TOKEN: authentication
- HEIMDALL_BASE_URL: Heimdall service URL (enables Heimdall auth)
- HEIMDALL_TIMEOUT: Heimdall request timeout in seconds (default: 10)
- HEIMDALL_TRUST_ALL: trust all SSL certs for Heimdall (default: true)
- OPENMETADATA_VERIFY_SSL: enable SSL verification (default: false)
- OPENMETADATA_CA_BUNDLE: path to CA bundle
- OPENMETADATA_CLIENT_TIMEOUT: client timeout in seconds (default: 30)
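The resolution order is: explicit argument first, then the first matching environment variable, then the documented default. A minimal illustrative sketch of that fallback order (the resolve_setting helper is hypothetical, not part of the SDK):

```python
import os

def resolve_setting(explicit, env_vars, default=None):
    """Hypothetical helper mirroring configure()'s documented fallback order:
    explicit argument -> first set environment variable -> default."""
    if explicit is not None:
        return explicit
    for var in env_vars:
        if var in os.environ:
            return os.environ[var]
    return default

os.environ["OPENMETADATA_HOST"] = "http://localhost:8585/api"

# An explicit argument wins over the environment.
host = resolve_setting("http://om.internal/api",
                       ["OPENMETADATA_HOST", "OPENMETADATA_SERVER_URL"])
# No explicit value: fall back to the environment variable.
env_host = resolve_setting(None,
                           ["OPENMETADATA_HOST", "OPENMETADATA_SERVER_URL"])
# Nothing set anywhere: the documented default applies.
timeout = resolve_setting(None, ["OPENMETADATA_CLIENT_TIMEOUT"], default=30)

print(host, env_host, timeout)
```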
Alternative: Builder Pattern
from metadata.sdk.config import OpenMetadataConfig
config = (
OpenMetadataConfig.builder()
.server_url("http://localhost:8585/api")
.api_key("your-dataos-api-key")
.heimdall_configuration({
"enabled": True,
"baseUrl": "https://your-instance.dataos.cloud/heimdall",
"timeout": 15,
"fallbackOnBasic": True,
})
.build()
)
Or with flat params: .heimdall_url("...").heimdall_timeout(15).
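The builder accumulates settings through chained calls and assembles the result on build(). A simplified, self-contained sketch of that pattern (a toy class, not the real OpenMetadataConfig implementation):

```python
class ConfigBuilder:
    """Toy fluent builder mirroring the OpenMetadataConfig.builder() style."""

    def __init__(self):
        self._settings = {}

    def server_url(self, url):
        self._settings["server_url"] = url
        return self  # returning self is what makes the chained calls work

    def api_key(self, key):
        self._settings["api_key"] = key
        return self

    def heimdall_timeout(self, seconds):
        self._settings["heimdall_timeout"] = seconds
        return self

    def build(self):
        if "server_url" not in self._settings:
            raise ValueError("server_url is required")
        return dict(self._settings)  # the real builder returns a config object

config = (
    ConfigBuilder()
    .server_url("http://localhost:8585/api")
    .api_key("your-dataos-api-key")
    .heimdall_timeout(15)
    .build()
)
print(config["heimdall_timeout"])
```

Each setter returns the builder itself, which is what allows the dotted chain shown above; build() is the single point where required settings can be validated.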
Alternative: Direct JWT (Legacy)
If Heimdall is not available, the SDK falls back to direct JWT authentication:
from metadata.sdk import configure
configure(host="http://localhost:8585/api", jwt_token="your-om-jwt-token")
Using it like the OpenMetadata Python SDK
hera2-sdk depends on openmetadata-ingestion, so you can use the same low-level API as in the official docs: OpenMetadataConnection + OpenMetadata(server_config), then metadata.create_or_update(), metadata.get_by_name(), metadata.delete(), etc.
Option 1 — Standard OpenMetadata style (same as the docs)
from metadata.ingestion.ometa.ometa_api import OpenMetadata
from metadata.generated.schema.entity.services.connections.metadata.openMetadataConnection import (
OpenMetadataConnection,
AuthProvider,
)
from metadata.generated.schema.security.client.openMetadataJWTClientConfig import (
OpenMetadataJWTClientConfig,
)
from metadata.generated.schema.entity.data.table import Table
server_config = OpenMetadataConnection(
hostPort="http://localhost:8585/api",
authProvider=AuthProvider.openmetadata,
securityConfig=OpenMetadataJWTClientConfig(
jwtToken="<YOUR-INGESTION-BOT-JWT-TOKEN>",
),
)
metadata = OpenMetadata(server_config)
# Same API as in the docs
metadata.health_check()
# `create_service` is a Create*Request (e.g. a CreateDatabaseServiceRequest) built as in the OpenMetadata docs walkthrough
service_entity = metadata.create_or_update(data=create_service)
my_table = metadata.get_by_name(entity=Table, fqn="test-service-table.test-db.test-schema.test")
metadata.delete(entity=Table, entity_id=my_table.id)
Option 2 — hera2-sdk wrapper (same API + optional Heimdall)
Use configure() or OpenMetadataConfig, then get the underlying client via .ometa and call the same methods:
from metadata.sdk import configure, client
from metadata.generated.schema.entity.data.table import Table
configure(
host="http://localhost:8585/api",
api_key="your-dataos-api-key",
heimdall_configuration={
"enabled": True,
"baseUrl": "https://your-instance.dataos.cloud/heimdall",
"timeout": 10,
"fallbackOnBasic": True,
},
)
metadata = client().ometa # same interface as OpenMetadata(server_config)
metadata.health_check()
# `create_service` is a Create*Request built as in the OpenMetadata docs walkthrough
service_entity = metadata.create_or_update(data=create_service)
my_table = metadata.get_by_name(entity=Table, fqn="test-service-table.test-db.test-schema.test")
metadata.delete(entity=Table, entity_id=my_table.id)
So you can follow the OpenMetadata SDK walkthrough (create DatabaseService, Database, Schema, Table, etc.) with either the raw OpenMetadata from metadata.ingestion.ometa.ometa_api or with client().ometa after configuring hera2-sdk.
Manual Initialization
For more control, you can manually initialize the SDK:
from metadata.sdk import OpenMetadata, OpenMetadataConfig
from metadata.sdk.entities import Table, User
from metadata.sdk.api import Search, Lineage, Bulk
config = OpenMetadataConfig(
server_url="http://localhost:8585/api",
api_key="your-dataos-api-key",
heimdall_configuration={
"enabled": True,
"baseUrl": "https://your-instance.dataos.cloud/heimdall",
"timeout": 10,
"fallbackOnBasic": True,
},
)
client = OpenMetadata.initialize(config)
Table.set_default_client(client)
User.set_default_client(client)
Search.set_default_client(client)
Lineage.set_default_client(client)
Bulk.set_default_client(client)
Configuration from Environment Variables Only
from metadata.sdk.config import OpenMetadataConfig
# Reads from OPENMETADATA_HOST, OPENMETADATA_API_KEY, HEIMDALL_BASE_URL, etc.
config = OpenMetadataConfig.from_env()
Entity Operations
Tables
from metadata.generated.schema.api.data.createTable import CreateTableRequest
from metadata.sdk.entities.table import TableListParams
# Create a table
request = CreateTableRequest(
name="my_table",
databaseSchema="my_schema",
columns=[...]
)
table = Table.create(request)
# Retrieve a table by ID
table = Table.retrieve("table-id")
# Retrieve by fully qualified name with specific fields
table = Table.retrieve_by_name(
"service.database.schema.table",
fields=["owners", "tags", "columns"]
)
# List tables with pagination
for table in Table.list().auto_paging_iterable():
print(table.name)
# List with filters
params = TableListParams.builder() \
.limit(50) \
.database("my_database") \
.fields(["owners", "tags"]) \
.build()
tables = Table.list(params)
# Update a table
table.description = "Updated description"
updated = Table.update(table.id, table)
# Delete a table
Table.delete("table-id")
# Delete with options
Table.delete("table-id", recursive=True, hard_delete=True)
# Export/Import CSV
csv_data = Table.export_csv("table-name")
Table.import_csv(csv_data, dry_run=False)
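Fully qualified names are dot-separated, service.database.schema.table for tables. A small illustrative helper for splitting one (hypothetical, not an SDK function; the real SDK also handles quoted name parts that themselves contain dots):

```python
def split_table_fqn(fqn: str):
    """Split a simple table FQN into its four parts.
    Illustrative only: assumes no name part contains a dot."""
    parts = fqn.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected service.database.schema.table, got {fqn!r}")
    service, database, schema, table = parts
    return {"service": service, "database": database,
            "schema": schema, "table": table}

parts = split_table_fqn("service.database.schema.table")
print(parts["schema"])
```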
Supported Entity Types
The SDK provides the same fluent API for all OpenMetadata entity types:
- Data Assets: Table, Database, DatabaseSchema, Dashboard, Pipeline, Topic, Container, Query, StoredProcedure, DashboardDataModel, SearchIndex, MlModel, Report
- Services: DatabaseService, MessagingService, DashboardService, PipelineService, MlModelService, StorageService, SearchService, MetadataService, ApiService
- Teams & Users: User, Team, Role, Policy
- Governance: Glossary, GlossaryTerm, Classification, Tag, DataProduct, Domain
- Quality: TestCase, TestSuite, TestDefinition, DataQualityDashboard
- Ingestion: Ingestion, Workflow, Connection
- Other: Type, Webhook, Kpi, Application, Persona, DocStore, Page, SearchQuery
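Each of these entity classes exposes the same classmethod surface (create, retrieve, delete, and so on) against a shared default client, which is why the Table examples above carry over to every other type. A toy sketch of that uniform pattern (the names and in-memory store are illustrative, not the SDK's internals):

```python
class BaseEntity:
    """Toy base class sketching the shared fluent surface of the entity types."""
    _default_client = None
    _store = {}  # stands in for the OpenMetadata server

    @classmethod
    def set_default_client(cls, client):
        cls._default_client = client

    @classmethod
    def create(cls, name):
        entity = {"type": cls.__name__, "name": name}
        cls._store[(cls.__name__, name)] = entity
        return entity

    @classmethod
    def retrieve_by_name(cls, name):
        return cls._store[(cls.__name__, name)]

    @classmethod
    def delete(cls, name):
        cls._store.pop((cls.__name__, name), None)

# Every entity type gets the identical API just by subclassing.
class Table(BaseEntity): pass
class Dashboard(BaseEntity): pass

Table.create("orders")
Dashboard.create("sales")
print(Table.retrieve_by_name("orders")["type"])
```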
Testing
Run the SDK tests:
# Run all SDK tests
pytest tests/unit/sdk/
# Run specific test
pytest tests/unit/sdk/test_sdk_entities.py
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
File details
Details for the file hera2_sdk-1.11.8.5.tar.gz.
File metadata
- Download URL: hera2_sdk-1.11.8.5.tar.gz
- Upload date:
- Size: 55.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 466d9588842e13515c033ea7c6b75a56c0ef7c38c14b7da0fc86898b8f9b5721 |
| MD5 | b302a18b133b5a68e63068962a9008fe |
| BLAKE2b-256 | 0304f5dcc81d54defb800192bc83f829cb034523938d269cf186529c7ebe32b0 |
File details
Details for the file hera2_sdk-1.11.8.5-py3-none-any.whl.
File metadata
- Download URL: hera2_sdk-1.11.8.5-py3-none-any.whl
- Upload date:
- Size: 75.9 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.9.6
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d35e037061f971e9adf0069068be7476967e083a825d9e138b0df0e05cdb6fe6 |
| MD5 | 5f6065ed3a7f9288ab4a19b5945ce268 |
| BLAKE2b-256 | c8776c02f44a55cc8295265650409402d4028b4af826df067b4bdea18336efc4 |