
StarRocks database adapter for Datus


datus-starrocks


Overview

StarRocks is a high-performance analytical database that speaks the MySQL wire protocol. This adapter therefore extends the MySQL connector with StarRocks-specific features:

  • Multi-catalog support
  • Materialized views
  • StarRocks-specific metadata queries

Installation

pip install datus-starrocks

This will automatically install the required dependencies:

  • datus-agent
  • datus-mysql (which includes datus-sqlalchemy)

Usage

The adapter is automatically registered with Datus when installed. Configure your database connection:

database:
  type: starrocks
  host: localhost
  port: 9030
  username: root
  password: your_password
  catalog: default_catalog
  database: your_database

Or use the connector programmatically:

from datus_starrocks import StarRocksConnector

# Create connector
connector = StarRocksConnector(
    host="localhost",
    port=9030,
    user="root",
    password="your_password",
    catalog="default_catalog",
    database="mydb"
)

# Use context manager for automatic cleanup
with connector:
    # Test connection
    connector.test_connection()

    # Get catalogs
    catalogs = connector.get_catalogs()
    print(f"Catalogs: {catalogs}")

    # Get databases in catalog
    databases = connector.get_databases(catalog_name="default_catalog")
    print(f"Databases: {databases}")

    # Get tables
    tables = connector.get_tables(catalog_name="default_catalog", database_name="mydb")
    print(f"Tables: {tables}")

    # Get materialized views
    mvs = connector.get_materialized_views(database_name="mydb")
    print(f"Materialized Views: {mvs}")

    # Get materialized views with DDL
    mvs_with_ddl = connector.get_materialized_views_with_ddl(database_name="mydb")
    for mv in mvs_with_ddl:
        print(f"\n{mv['table_name']}:")
        print(mv['definition'])

    # Execute query
    result = connector.execute_query("SELECT * FROM users LIMIT 10")
    print(result.sql_return)

Features

StarRocks-Specific Features

  • Multi-catalog support: Query across multiple catalogs
  • Materialized views: Full support for StarRocks materialized views
  • Catalog management: Switch between catalogs seamlessly

Inherited from MySQL

  • Full CRUD operations (SELECT, INSERT, UPDATE, DELETE)
  • DDL execution (CREATE, ALTER, DROP)
  • Metadata retrieval (tables, views, schemas)
  • Sample data extraction
  • Multiple result formats (pandas, arrow, csv, list)
  • Connection pooling and management
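The multiple result formats come from the underlying datus-sqlalchemy layer; as a rough, standalone illustration of what the "list" and "csv" shapes look like for a small result set (the rows and column names here are invented, not real connector output):

```python
import csv
import io

# Invented sample rows, standing in for what a SELECT might return.
columns = ["id", "name"]
rows = [(1, "alice"), (2, "bob"), (3, "carol")]

# "list" shape: one dict per row, keyed by column name
as_list = [dict(zip(columns, row)) for row in rows]

# "csv" shape: rendered with the stdlib csv module
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(columns)
writer.writerows(rows)
as_csv = buf.getvalue()

print(as_list[0])              # {'id': 1, 'name': 'alice'}
print(as_csv.splitlines()[0])  # id,name
```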

StarRocks-Specific Examples

Working with Catalogs

# List all catalogs
catalogs = connector.get_catalogs()

# Switch catalog
connector.switch_context(catalog_name="hive_catalog")

# Query with explicit catalog
tables = connector.get_tables(
    catalog_name="hive_catalog",
    database_name="my_hive_db"
)

Materialized Views

# Get materialized views
mvs = connector.get_materialized_views(database_name="mydb")

# Get materialized views with full DDL
mvs_with_ddl = connector.get_materialized_views_with_ddl(database_name="mydb")

for mv in mvs_with_ddl:
    print(f"Name: {mv['table_name']}")
    print(f"Database: {mv['database_name']}")
    print(f"Catalog: {mv['catalog_name']}")
    print(f"Definition: {mv['definition']}")

Fully-Qualified Names

StarRocks supports three-part names: catalog.database.table

# Build full name
full_name = connector.full_name(
    catalog_name="default_catalog",
    database_name="mydb",
    table_name="users"
)
# Result: `default_catalog`.`mydb`.`users`

# Query with full name
result = connector.execute_query(f"SELECT * FROM {full_name} LIMIT 10")
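As a sketch of what full_name presumably does, quoting each part with backticks and joining with dots (a standalone reimplementation for illustration, not the connector's actual code):

```python
def full_name(catalog_name: str, database_name: str, table_name: str) -> str:
    """Backtick-quote each part and join with dots to form a three-part name."""
    parts = (catalog_name, database_name, table_name)
    return ".".join(f"`{p}`" for p in parts)

print(full_name("default_catalog", "mydb", "users"))
# `default_catalog`.`mydb`.`users`
```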

Requirements

  • Python >= 3.12
  • StarRocks >= 2.0
  • datus-agent >= 0.2.1
  • datus-mysql >= 0.1.0

Testing

Quick Start

# 1. Start StarRocks test container
docker compose up -d && sleep 60
docker exec datus-starrocks-test mysql -h127.0.0.1 -P9030 -uroot \
  -e "CREATE DATABASE IF NOT EXISTS test;"

# 2. Run tests
./scripts/test.sh unit         # Unit tests (60 tests, ~0.03s)
./scripts/test.sh integration  # Integration tests (35 tests, ~1.5s)
./scripts/test.sh acceptance   # Acceptance tests (28 tests, CI subset)
./scripts/test.sh all          # All tests

TPC-H Integration Tests

TPC-H integration tests use a simplified TPC-H dataset (5 tables: region, nation, customer, orders, supplier) to validate end-to-end query execution, JOIN operations, aggregations, and multi-format output.

# Start StarRocks and create test database
docker compose up -d && sleep 60
docker exec datus-starrocks-test mysql -h127.0.0.1 -P9030 -uroot \
  -e "CREATE DATABASE IF NOT EXISTS test;"

# Initialize TPC-H test data
uv run python scripts/init_tpch_data.py

# Run TPC-H integration tests
uv run pytest tests/integration/test_tpch.py -m integration -v

# Clean re-init (drop and recreate tables)
uv run python scripts/init_tpch_data.py --drop

TPC-H Tables:

Table          Rows  Description
tpch_region       5  Standard TPC-H regions
tpch_nation      25  Standard TPC-H nations
tpch_customer    10  Simplified customer data
tpch_orders      15  Simplified order data
tpch_supplier     5  Simplified supplier data

Tables use ENGINE=OLAP with PRIMARY KEY and DISTRIBUTED BY HASH for StarRocks-optimized storage.
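An illustrative CREATE TABLE in that style, assuming columns close to the standard TPC-H region table (the real scripts/init_tpch_data.py may differ in columns and bucket count):

```python
# Hypothetical DDL in the style described above.
ddl = """\
CREATE TABLE IF NOT EXISTS tpch_region (
    r_regionkey INT NOT NULL,
    r_name      VARCHAR(25),
    r_comment   VARCHAR(152)
)
ENGINE = OLAP
PRIMARY KEY (r_regionkey)
DISTRIBUTED BY HASH (r_regionkey) BUCKETS 1"""

# A connector would submit this statement like any other query.
print(ddl.splitlines()[0])  # CREATE TABLE IF NOT EXISTS tpch_region (
```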

Test Types

  • Unit tests (60): Configuration and connector logic with mocks (no database needed)
  • Integration tests (35+): Real database operations (catalog, materialized views, SQL)
  • TPC-H tests (11): Metadata, queries, JOINs, aggregations, multi-format output
  • Acceptance tests (28): Critical functionality subset for CI/CD

Environment Variables

Tests use these default values (automatically set by ./scripts/test.sh):

Variable            Default          Description
STARROCKS_HOST      localhost        StarRocks host
STARROCKS_PORT      9030             MySQL protocol port
STARROCKS_USER      root             Username
STARROCKS_PASSWORD  (empty)          Password
STARROCKS_CATALOG   default_catalog  Catalog
STARROCKS_DATABASE  test             Database
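One way a config loader or test fixture might resolve these defaults (a sketch; StarRocksConfig's actual loading logic may differ):

```python
import os

# Defaults mirroring the table above.
DEFAULTS = {
    "STARROCKS_HOST": "localhost",
    "STARROCKS_PORT": "9030",
    "STARROCKS_USER": "root",
    "STARROCKS_PASSWORD": "",
    "STARROCKS_CATALOG": "default_catalog",
    "STARROCKS_DATABASE": "test",
}

def resolve(environ=os.environ):
    """Return each setting from the environment, falling back to its default."""
    return {key: environ.get(key, default) for key, default in DEFAULTS.items()}

print(resolve({})["STARROCKS_PORT"])  # 9030
```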

Code Structure

datus-starrocks/
├── datus_starrocks/
│   ├── __init__.py          # Package exports
│   ├── config.py            # StarRocksConfig model
│   └── connector.py         # StarRocksConnector (extends MySQLConnector)
├── tests/
│   ├── unit/                # Unit tests (no database required)
│   └── integration/
│       ├── conftest.py      # Fixtures (config, connector, tpch_setup)
│       ├── test_connector.py # Core integration tests
│       └── test_tpch.py     # TPC-H benchmark tests
├── scripts/
│   ├── test.sh              # Test runner script
│   └── init_tpch_data.py    # Manual TPC-H data initialization
├── docker-compose.yml       # StarRocks 3.3.0 test container
├── pyproject.toml
└── README.md

Connection Cleanup

The connector includes special handling for PyMySQL cleanup errors that can occur with StarRocks connections. Use the context manager pattern for automatic cleanup:

with StarRocksConnector(...) as connector:
    # Your code here
    pass
# Connection automatically cleaned up
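In plain Python, the pattern amounts to something like this (the exact exception types PyMySQL raises here are an assumption; the real connector's handling may be broader):

```python
class SafeClose:
    """Sketch of the cleanup pattern: close on exit, swallowing the benign
    errors a client library can raise when the server drops the socket first."""

    def __init__(self, conn):
        self.conn = conn

    def __enter__(self):
        return self.conn

    def __exit__(self, exc_type, exc, tb):
        try:
            self.conn.close()
        except OSError:
            pass  # connection already gone; nothing left to clean up
        return False  # never suppress the caller's own exceptions


# Usage with a dummy connection whose close() fails like a dead socket:
class DummyConn:
    def close(self):
        raise OSError("socket already closed")

with SafeClose(DummyConn()):
    pass  # exits cleanly even though close() raised
print("cleaned up")
```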

License

Apache License 2.0

Related Packages

  • datus-mysql - MySQL adapter (base for StarRocks)
  • datus-sqlalchemy - SQLAlchemy base connector
  • datus-snowflake - Snowflake adapter



Download files


Source Distribution

datus_starrocks-0.1.6rc1.tar.gz (21.4 kB)


Built Distribution


datus_starrocks-0.1.6rc1-py3-none-any.whl (10.6 kB)


File details

Details for the file datus_starrocks-0.1.6rc1.tar.gz.

File metadata

  • Download URL: datus_starrocks-0.1.6rc1.tar.gz
  • Size: 21.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.2.0 CPython/3.12.12

File hashes

Hashes for datus_starrocks-0.1.6rc1.tar.gz
Algorithm    Hash digest
SHA256       744aca063c18ce9fa77f09158500e1f801ffefd784b24db458d2e09faefae967
MD5          24d8191c706bfd045b0bb783f70ddd4f
BLAKE2b-256  97e1a109b6f2fc3f608d7de644ba67779a0c7218b0591517d248efd8ddc15481


File details

Details for the file datus_starrocks-0.1.6rc1-py3-none-any.whl.


File hashes

Hashes for datus_starrocks-0.1.6rc1-py3-none-any.whl
Algorithm    Hash digest
SHA256       a591cd53c65f2800a9b4000e2d7d4a35eb975a3fa6e13ee0fc7a97abd95e9c8b
MD5          c4889e282b5ac3f968f0e8b56a8c11b6
BLAKE2b-256  4a956f7bb79f02f0bce82d6bf8af9dbb9aeb8468d100dd75b806cb8443c72584

