
datus-starrocks

StarRocks database adapter for Datus.

Overview

StarRocks is a high-performance analytical database that speaks the MySQL wire protocol. This adapter extends the MySQL connector with StarRocks-specific features:

  • Multi-catalog support
  • Materialized views
  • StarRocks-specific metadata queries

Installation

pip install datus-starrocks

This will automatically install the required dependencies:

  • datus-agent
  • datus-mysql (which includes datus-sqlalchemy)

Usage

The adapter is automatically registered with Datus when installed. Configure your database connection:

database:
  type: starrocks
  host: localhost
  port: 9030
  username: root
  password: your_password
  catalog: default_catalog
  database: your_database

Or use the connector programmatically:

from datus_starrocks import StarRocksConnector

# Create connector
connector = StarRocksConnector(
    host="localhost",
    port=9030,
    user="root",
    password="your_password",
    catalog="default_catalog",
    database="mydb"
)

# Use context manager for automatic cleanup
with connector:
    # Test connection
    connector.test_connection()

    # Get catalogs
    catalogs = connector.get_catalogs()
    print(f"Catalogs: {catalogs}")

    # Get databases in catalog
    databases = connector.get_databases(catalog_name="default_catalog")
    print(f"Databases: {databases}")

    # Get tables
    tables = connector.get_tables(catalog_name="default_catalog", database_name="mydb")
    print(f"Tables: {tables}")

    # Get materialized views
    mvs = connector.get_materialized_views(database_name="mydb")
    print(f"Materialized Views: {mvs}")

    # Get materialized views with DDL
    mvs_with_ddl = connector.get_materialized_views_with_ddl(database_name="mydb")
    for mv in mvs_with_ddl:
        print(f"\n{mv['table_name']}:")
        print(mv['definition'])

    # Execute query
    result = connector.execute_query("SELECT * FROM users LIMIT 10")
    print(result.sql_return)

Features

StarRocks-Specific Features

  • Multi-catalog support: Query across multiple catalogs
  • Materialized views: Full support for StarRocks materialized views
  • Catalog management: Switch between catalogs seamlessly

Inherited from MySQL

  • Full CRUD operations (SELECT, INSERT, UPDATE, DELETE)
  • DDL execution (CREATE, ALTER, DROP)
  • Metadata retrieval (tables, views, schemas)
  • Sample data extraction
  • Multiple result formats (pandas, arrow, csv, list)
  • Connection pooling and management
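As a rough illustration of the "multiple result formats" idea, the same result rows can be rendered as a plain Python list or as CSV text. The rows, column names, and helper variables below are invented for the example; they are not the adapter's API:

```python
import csv
import io

# Illustrative only: suppose a query returned these column names and rows
columns = ["id", "name"]
rows = [(1, "alice"), (2, "bob")]

# "list" format: rows as plain Python lists
as_list = [list(r) for r in rows]

# "csv" format: serialize with the stdlib csv module
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(columns)
writer.writerows(rows)
as_csv = buf.getvalue()

print(as_list)
print(as_csv)
```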

StarRocks-Specific Examples

Working with Catalogs

# List all catalogs
catalogs = connector.get_catalogs()

# Switch catalog
connector.switch_context(catalog_name="hive_catalog")

# Query with explicit catalog
tables = connector.get_tables(
    catalog_name="hive_catalog",
    database_name="my_hive_db"
)

Materialized Views

# Get materialized views
mvs = connector.get_materialized_views(database_name="mydb")

# Get materialized views with full DDL
mvs_with_ddl = connector.get_materialized_views_with_ddl(database_name="mydb")

for mv in mvs_with_ddl:
    print(f"Name: {mv['table_name']}")
    print(f"Database: {mv['database_name']}")
    print(f"Catalog: {mv['catalog_name']}")
    print(f"Definition: {mv['definition']}")

Fully-Qualified Names

StarRocks supports three-part names: catalog.database.table

# Build full name
full_name = connector.full_name(
    catalog_name="default_catalog",
    database_name="mydb",
    table_name="users"
)
# Result: `default_catalog`.`mydb`.`users`

# Query with full name
result = connector.execute_query(f"SELECT * FROM {full_name} LIMIT 10")
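For intuition, the backtick quoting shown above can be sketched in plain Python. This is an illustrative re-implementation, not the connector's actual code:

```python
def quoted_full_name(catalog_name: str, database_name: str, table_name: str) -> str:
    """Backtick-quote each identifier part, MySQL-style, and join with dots."""
    def quote(ident: str) -> str:
        # Escape any embedded backticks by doubling them
        return "`" + ident.replace("`", "``") + "`"
    return ".".join(quote(p) for p in (catalog_name, database_name, table_name))

print(quoted_full_name("default_catalog", "mydb", "users"))
# `default_catalog`.`mydb`.`users`
```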

Requirements

  • Python >= 3.12
  • StarRocks >= 2.0
  • datus-agent >= 0.2.1
  • datus-mysql >= 0.1.0

Testing

Quick Start

# 1. Start StarRocks test container
docker compose up -d && sleep 60
docker exec datus-starrocks-test mysql -h127.0.0.1 -P9030 -uroot \
  -e "CREATE DATABASE IF NOT EXISTS test;"

# 2. Run tests
./scripts/test.sh unit         # Unit tests (60 tests, ~0.03s)
./scripts/test.sh integration  # Integration tests (35 tests, ~1.5s)
./scripts/test.sh acceptance   # Acceptance tests (28 tests, CI subset)
./scripts/test.sh all          # All tests

TPC-H Integration Tests

TPC-H integration tests use a simplified TPC-H dataset (5 tables: region, nation, customer, orders, supplier) to validate end-to-end query execution, JOIN operations, aggregations, and multi-format output.

# Start StarRocks and create test database
docker compose up -d && sleep 60
docker exec datus-starrocks-test mysql -h127.0.0.1 -P9030 -uroot \
  -e "CREATE DATABASE IF NOT EXISTS test;"

# Initialize TPC-H test data
uv run python scripts/init_tpch_data.py

# Run TPC-H integration tests
uv run pytest tests/integration/test_tpch.py -m integration -v

# Clean re-init (drop and recreate tables)
uv run python scripts/init_tpch_data.py --drop

TPC-H Tables:

Table          Rows  Description
tpch_region       5  Standard TPC-H regions
tpch_nation      25  Standard TPC-H nations
tpch_customer    10  Simplified customer data
tpch_orders      15  Simplified order data
tpch_supplier     5  Simplified supplier data

Tables use ENGINE=OLAP with PRIMARY KEY and DISTRIBUTED BY HASH for StarRocks-optimized storage.
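As a sketch, a table definition along those lines might look like the following. The column set, bucket count, and replication property are assumptions for illustration, not the init script's actual DDL:

```python
# Illustrative StarRocks DDL for one of the TPC-H tables; column types and
# bucket count are assumptions, not the real init script's definitions.
ddl = """
CREATE TABLE IF NOT EXISTS tpch_region (
    r_regionkey INT NOT NULL,
    r_name      VARCHAR(25),
    r_comment   VARCHAR(152)
)
ENGINE = OLAP
PRIMARY KEY (r_regionkey)
DISTRIBUTED BY HASH (r_regionkey) BUCKETS 1
PROPERTIES ("replication_num" = "1")
"""

# With a live connection this could be executed as:
# connector.execute_query(ddl)
print(ddl)
```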

Test Types

  • Unit tests (60): Configuration and connector logic with Mocks (no database needed)
  • Integration tests (35+): Real database operations (catalog, materialized views, SQL)
  • TPC-H tests (11): Metadata, queries, JOINs, aggregations, multi-format output
  • Acceptance tests (28): Critical functionality subset for CI/CD

Environment Variables

Tests use these default values (automatically set by ./scripts/test.sh):

Variable            Default          Description
STARROCKS_HOST      localhost        StarRocks host
STARROCKS_PORT      9030             MySQL protocol port
STARROCKS_USER      root             Username
STARROCKS_PASSWORD  (empty)          Password
STARROCKS_CATALOG   default_catalog  Catalog
STARROCKS_DATABASE  test             Database
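A minimal sketch of how a test fixture might read these variables, falling back to the documented defaults. The function name and dict shape are illustrative, not the test suite's actual code:

```python
import os

def starrocks_settings() -> dict:
    """Read StarRocks test settings from the environment,
    falling back to the documented defaults."""
    return {
        "host": os.getenv("STARROCKS_HOST", "localhost"),
        "port": int(os.getenv("STARROCKS_PORT", "9030")),
        "user": os.getenv("STARROCKS_USER", "root"),
        "password": os.getenv("STARROCKS_PASSWORD", ""),
        "catalog": os.getenv("STARROCKS_CATALOG", "default_catalog"),
        "database": os.getenv("STARROCKS_DATABASE", "test"),
    }

print(starrocks_settings())
```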

Code Structure

datus-starrocks/
├── datus_starrocks/
│   ├── __init__.py          # Package exports
│   ├── config.py            # StarRocksConfig model
│   └── connector.py         # StarRocksConnector (extends MySQLConnector)
├── tests/
│   ├── unit/                # Unit tests (no database required)
│   └── integration/
│       ├── conftest.py      # Fixtures (config, connector, tpch_setup)
│       ├── test_connector.py # Core integration tests
│       └── test_tpch.py     # TPC-H benchmark tests
├── scripts/
│   ├── test.sh              # Test runner script
│   └── init_tpch_data.py    # Manual TPC-H data initialization
├── docker-compose.yml       # StarRocks 3.3.0 test container
├── pyproject.toml
└── README.md

Connection Cleanup

The connector includes special handling for PyMySQL cleanup errors that can occur with StarRocks connections. Use the context manager pattern for automatic cleanup:

with StarRocksConnector(...) as connector:
    # Your code here
    pass
# Connection automatically cleaned up
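The idea can be mimicked with a toy context manager that swallows errors raised during close. This is purely illustrative of the pattern; the connector's real PyMySQL handling is its own:

```python
class QuietClose:
    """Toy stand-in for a connection whose cleanup may fail,
    as PyMySQL's sometimes does with StarRocks connections."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            self.close()
        except Exception:
            pass  # ignore cleanup-time errors so they don't mask real ones
        return False  # never suppress exceptions raised in the with-body

    def close(self):
        raise OSError("simulated socket teardown failure")

with QuietClose():
    pass  # the failing close() above does not escape the with-block
print("exited cleanly")
```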

License

Apache License 2.0

Related Packages

  • datus-mysql - MySQL adapter (base for StarRocks)
  • datus-sqlalchemy - SQLAlchemy base connector
  • datus-snowflake - Snowflake adapter
