Python SQLAlchemy Dialect for StarRocks with optional Alembic integration
Project description
StarRocks Python Client
A StarRocks client for the Python programming language, including a SQLAlchemy dialect and an Alembic extension.
StarRocks is a next-generation data platform designed for fast,real-time analytics. This package allows developers to interact with StarRocks using Python, leveraging SQLAlchemy's powerful ORM and expression language, and managing database schema migrations with Alembic.
Quick Start
Installation
pip install starrocks
Supported Python Versions
Python >= 3.10, <= 3.14
Using a Virtual Environment (Recommended)
It is highly recommended to install starrocks in a virtual environment to avoid conflicts with system-wide packages.
Mac/Linux:
pip install virtualenv
virtualenv <your-env-name>
source <your-env-name>/bin/activate
<your-env-name>/bin/pip install starrocks
Windows:
pip install virtualenv
virtualenv <your-env-name>
<your-env-name>\Scripts\activate
<your-env-name>\Scripts\pip.exe install starrocks
Basic SQLAlchemy Usage
To connect to StarRocks, use the SQLAlchemy connection string format:
starrocks://<User>:<Password>@<Host>:<Port>/[<Catalog>.]<Database>
Or, for an asynchronous connection, use asyncmy driver:
starrocks+asyncmy://<User>:<Password>@<Host>:<Port>/[<Catalog>.]<Database>
- User: User Name
- Password: DBPassword
- Host: StarRocks FE Host
- Catalog: Catalog Name
- Database: Database Name
- Port: StarRocks FE port
Note: The
Catalogcan be omitted and is managed by StarRocks. The default isdefault_catalog.
Example: Basic Operations
Connect to your database and do a query.
from sqlalchemy import create_engine, text
engine = create_engine('starrocks://root@localhost:9030/mydatabase')
# make sure you have created the table `mytable` in `mydatabase`.
with engine.connect() as connection:
print("Connection successful!")
rows = connection.execute(text("SELECT * FROM mytable LIMIT 2")).fetchall()
print(rows)
# async version
import asyncio
from sqlalchemy.ext.asyncio import create_async_engine
engine = create_async_engine('starrocks+asyncmy://root@localhost:9030/mydatabase')
async def async_query():
async with engine.connect() as connection:
print("Connection successful!")
rows = await connection.execute(text("SELECT * FROM mytable LIMIT 2")).fetchall()
print(rows)
await engine.dispose()
asyncio.run(async_query())
Example: Defining a Table (ORM Style)
You can define a table with StarRocks-specific attributes using SQLAlchemy's ORM declarative style: orm-quickstart.
from sqlalchemy import Column
from sqlalchemy.orm import Mapped, declarative_base, mapped_column
from starrocks import INTEGER, STRING
Base = declarative_base()
class MyTable(Base):
__tablename__ = 'my_orm_table'
id: Mapped[int] = mapped_column(INTEGER, primary_key=True)
name: Mapped[str] = mapped_column(STRING)
__table_args__ = {
'comment': 'table comment',
'starrocks_primary_key': 'id',
'starrocks_distributed_by': 'HASH(id) BUCKETS 10',
'starrocks_properties': {'replication_num': '1'}
}
# Create the table in the database
Base.metadata.create_all(engine)
Example: Defining a Table (Core Style)
Alternatively, you can use SQLAlchemy Core to define tables programmatically.
from sqlalchemy import Column, MetaData, Table
from starrocks import INTEGER, VARCHAR
metadata = MetaData()
my_core_table = Table(
'my_core_table',
metadata,
Column('id', INTEGER, primary_key=True),
Column('name', VARCHAR(50)),
# StarRocks-specific arguments
starrocks_primary_key='id',
starrocks_distributed_by='HASH(id) BUCKETS 10',
starrocks_properties={"replication_num": "1"}
)
# Create the table in the database
metadata.create_all(engine)
For a complete guide on defining tables (Core and ORM), executing queries, and using advanced features, please see the SQLAlchemy Usage Guide.
For a detailed reference on all StarRocks-specific table attributes and data types, please see the Table Definition Reference.
Example: Views and Materialized Views
Create a View and a Materialized View using the StarRocks helpers. These behave like SQLAlchemy Table objects and are created with metadata.create_all(engine).
from sqlalchemy import MetaData, text
from starrocks.sql.schema import View, MaterializedView
metadata = MetaData()
# Create a simple View (columns inferred from SELECT)
user_view = View(
'user_view',
metadata,
definition='SELECT id, name FROM my_core_table WHERE name IS NOT NULL',
comment='Active users'
)
# Create a simple Materialized View (asynchronous refresh)
user_stats_mv = MaterializedView(
'user_stats_mv',
metadata,
definition='SELECT id, COUNT(*) AS cnt FROM my_core_table GROUP BY id',
starrocks_refresh='ASYNC'
)
# Create the view and MV in the database
metadata.create_all(engine)
# Query the view or MV like normal tables
with engine.connect() as conn:
rows = conn.execute(text("SELECT * FROM user_view LIMIT 5")).fetchall()
print(rows)
You can refer to Views Definition Reference and Materialized View Definition Reference for more detailed information.
Alembic Integration for Schema Migrations
This dialect integrates with Alembic to support automated schema migrations. Here’s a quick-start guide to get you up and running.
Generate models from an existing database (Optional)
If you already have tables/views/materialized views in your StarRocks database, you can generate models.py (or a consolidated models file) using sqlacodegen.
sqlacodegen --options include_dialect_options,keep_dialect_types \
starrocks://root@localhost:9030 > models.py
Refer to generating models and sqlacodegen for more options and features.
1. Install and Initialize Alembic
pip install "alembic>=1.16"
alembic init alembic
2. Configure your Database URL, Logging Info
In alembic.ini, set the sqlalchemy.url to your StarRocks connection string.
# alembic.ini
sqlalchemy.url = starrocks://root@localhost:9030/mydatabase
It's better to print the log from this starrocks-sqlalchemy when runing alembic command. You can add following logging configration in the alembic.ini file.
# alembic.ini
[loggers]
# Append starrocks model at the following line
# keys = root,sqlalchemy,alembic
keys = root,sqlalchemy,alembic,starrocks
# Add following lines after `[logger_alembic]` section
[logger_starrocks]
level = INFO
handlers =
qualname = starrocks
3. Configure your Models for Autogeneration
In alembic/env.py, import your models' metadata and assign it to target_metadata.
# alembic/env.py
# Add these imports
from myapp.models import Base # Adjust to your models' location
from starrocks.alembic import render_column_type, include_object_for_view_mv
# ...
# And set the target_metadata
target_metadata = Base.metadata
def run_migrations_online() -> None:
# ... inside this function
context.configure(
# ...
render_item=render_column_type, # Add this line (required for column comparison)
include_object=include_object_for_view_mv # Add this line (required for View/MV support)
)
# ...
Note: For advanced filtering options (e.g., excluding temporary tables), see the Alembic Integration Guide.
4. Generate and Apply Your First Migration
With your models defined (as shown in the SQLAlchemy examples above), you can now generate and apply a migration.
# Generate the migration script
alembic revision --autogenerate -m "Create initial tables"
# Apply the migration to the database
alembic upgrade head
For a full tutorial on advanced topics like data migrations, handling complex types, and managing views, please refer to the Alembic Integration Guide.
Contributing
Tests
Running Unit Tests
To run tests for the StarRocks SQLAlchemy dialect, first install the package in editable mode along with its testing dependencies:
pip install -e .
pip install pytest mock
Then, you can run the test suite using pytest:
pytest
This will run the standard SQLAlchemy dialect test suite as well as StarRocks-specific tests. For more details, please check SQLAlchemy's guide for dialect development.
Test Logging
To see the raw SQL that the dialect compiles and executes during tests, you can modify the [tool.pytest.ini_options] section in the pyproject.toml file in your project with the following content:
[tool.pytest.ini_options]
log_cli = true
log_cli_level = DEBUG
log_cli_format = %(levelname)-5.5s [%(name)s] %(message)s
Running Integration and System Tests
To run the integration and system tests, you must have a running StarRocks cluster. The tests require a connection URL to be provided via the STARROCKS_URL environment variable.
-
Set up your StarRocks database: Ensure your StarRocks instance is running and you have a database available for testing (e.g.,
test_sqla).CREATE DATABASE IF NOT EXISTS test_sqla; -- set it if you're testing cases on small shared-nothing clusters ADMIN SET FRONTEND CONFIG ("default_replication_num" = "1");
-
Configure the connection URL: Set the
STARROCKS_URLenvironment variable in your shell. The format should be:export STARROCKS_URL="starrocks://<User>:<Password>@<Host>:<Port>/<Database>"
For example (the default url will be this if you don't set it):
export STARROCKS_URL="starrocks://root@127.0.0.1:9030/test_sqla"
-
Run the tests: With the development environment set up and the environment variable configured, you can run the tests using
pytest:pytest
Running the Full SQLAlchemy Test Suite
In addition to the StarRocks-specific tests, you can run the comprehensive test suite provided by SQLAlchemy to ensure full compatibility. To do this, you need to enable the SQLAlchemy test plugin in test/conftest.py.
-
Enable the SQLAlchemy test plugin: Open the file
test/conftest.pyand uncomment the line that imports fromsqlalchemy.testing.plugin.pytestplugin.# test/conftest.py # ... other imports # To run the test_suite.py and full SQLAlchemy test suite, uncomment the following line: from sqlalchemy.testing.plugin.pytestplugin import *
NOTE: If you enable this line, StarRocks-specific tests will not run. We will modify the StarRocks-specific tests using SQLAlchemy's test framework in the future.
You can enable SQLAlchemy's tests from the file
test/test_suite.pyby uncomment the following line:# To run the full SQLAlchemy test suite, uncomment the following line: from sqlalchemy.testing.suite import *
-
Run the tests:
pytest test/test_suite.py
This will run the standard SQLAlchemy dialect test suite as well as StarRocks-specific tests. For more details, please check SQLAlchemy's guide for dialect development.
Note: After running the full suite, it's good practice to re-comment the line in test/conftest.py to keep standard test runs focused on the dialect-specific tests.
Build and Deploy
To build and deploy the package, you'll need the build and twine tools. You can install them using pip:
pip install build twine
Once installed, follow these steps to build and release a new version:
-
Build the package:
This command packages your project into distribution files (
.tar.gzfor source and.whlfor a built distribution) and places them in thedist/directory.python3 -m build
The main outputs are
dist/starrocks-$version.tar.gzanddist/starrocks-$version-py3-none-any.whl. -
Upload to the test environment (TestPyPI):
Before publishing to the official PyPI, it's a good practice to upload to TestPyPI to ensure everything works as expected.
python3 -m twine upload --repository testpypi dist/*
-
Test the package from TestPyPI:
Install the package from TestPyPI to verify that it was uploaded correctly and can be installed by others.
python3 -m pip install --index-url https://test.pypi.org/simple/ starrocks
-
Upload to the production environment (PyPI):
After verifying the package on TestPyPI, upload it to the official Python Package Index (PyPI) to make it publicly available.
twine upload dist/*
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
Filter files by name, interpreter, ABI, and platform.
If you're not sure about the file name format, learn more about wheel file names.
Copy a direct link to the current filters
File details
Details for the file starrocks-1.3.3.tar.gz.
File metadata
- Download URL: starrocks-1.3.3.tar.gz
- Upload date:
- Size: 104.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
ee370e4566636169370a066e1f06b1caef1ddb3c26a43a3211d1de8c428e87db
|
|
| MD5 |
42c7c42277aedfa4cc93a7728f6bf5d9
|
|
| BLAKE2b-256 |
db35c99b2eb4f5a94bd5c8e8b8bed4bdd782f8f4b14c80a39cd71d1cfd9f3a49
|
File details
Details for the file starrocks-1.3.3-py3-none-any.whl.
File metadata
- Download URL: starrocks-1.3.3-py3-none-any.whl
- Upload date:
- Size: 109.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.10.11
File hashes
| Algorithm | Hash digest | |
|---|---|---|
| SHA256 |
b3ff89ed838d7f9ce07c7f97bd175d71e3539252a455314b6826636085573324
|
|
| MD5 |
a630d43b21f7560b352ae194d806a62a
|
|
| BLAKE2b-256 |
93b50d81ff72dfd5dabd844b0d8f1a50055e1b4671e3a72588c66f2e19a9ee9b
|