ClickZetta database adapter for Datus
Datus ClickZetta Adapter
This package provides a ClickZetta Lakehouse adapter for Datus, enabling integration with the ClickZetta analytics platform.
ClickZetta is developed by Singdata and Yunqi.
Installation
pip install datus-clickzetta
Dependencies
This adapter requires the following ClickZetta Python packages:
- clickzetta-connector-python
- clickzetta-zettapark-python
Configuration
Configure the ClickZetta connection in your Datus configuration. A complete example is available at examples/agent.clickzetta.yml.example.
namespace:
  clickzetta_prod:
    type: clickzetta
    service: "your-service-endpoint.clickzetta.com"
    username: "your-username"
    password: "your-password"
    instance: "your-instance-id"
    workspace: "your-workspace"
    schema: "PUBLIC" # optional, defaults to PUBLIC
    vcluster: "DEFAULT_AP" # optional, defaults to DEFAULT_AP
    secure: false # optional
Configuration Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| service | string | Yes | - | ClickZetta service endpoint |
| username | string | Yes | - | ClickZetta username |
| password | string | Yes | - | ClickZetta password |
| instance | string | Yes | - | ClickZetta instance identifier |
| workspace | string | Yes | - | ClickZetta workspace name |
| schema | string | No | "PUBLIC" | Default schema name |
| vcluster | string | No | "DEFAULT_AP" | Virtual cluster name |
| secure | boolean | No | null | Enable secure connection |
| hints | object | No | {} | Additional connection hints |
| extra | object | No | {} | Extra connection parameters |
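The required/optional split in the table can be checked before a connection is attempted. The following is a minimal sketch only: `normalize_config` is a hypothetical helper, not part of the adapter, and the adapter's own validation may behave differently.

```python
from typing import Any

# Required keys and optional defaults, copied from the parameter table above.
REQUIRED = ("service", "username", "password", "instance", "workspace")
DEFAULTS: dict[str, Any] = {
    "schema": "PUBLIC",
    "vcluster": "DEFAULT_AP",
    "secure": None,
    "hints": {},
    "extra": {},
}


def normalize_config(raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: check required keys and fill in documented defaults."""
    missing = [key for key in REQUIRED if not raw.get(key)]
    if missing:
        raise ValueError(
            "missing required ClickZetta parameters: " + ", ".join(missing)
        )
    # Copy mutable defaults so two configs never share the same dict object.
    merged = {k: (dict(v) if isinstance(v, dict) else v) for k, v in DEFAULTS.items()}
    merged.update(raw)
    return merged
```

Calling it with only the five required keys returns a dict in which `schema` is `"PUBLIC"` and `vcluster` is `"DEFAULT_AP"`; omitting a required key raises `ValueError` naming the missing parameters.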
Features
- Full SQL Support: Execute queries, DDL, DML operations
- Metadata Discovery: Automatic discovery of databases, schemas, tables, and views
- Volume Integration: Read files from ClickZetta volumes
- Sample Data: Extract sample rows for data profiling
- Connection Management: Automatic connection pooling and session management
Usage
Once installed and configured, use the ClickZetta adapter with Datus:
# Execute queries
result = agent.query("SELECT * FROM my_table LIMIT 10")
# Get table information
tables = agent.get_tables("my_schema")
Volume Operations
The adapter supports reading files from ClickZetta volumes:
# Read a file from a volume
content = connector.read_volume_file("volume:user://my_volume", "path/to/file.yaml")
# List files in a volume directory
files = connector.list_volume_files("volume:user://my_volume", "config/", suffixes=(".yaml", ".yml"))
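The `suffixes` argument above restricts a listing to matching file extensions. As an illustration of that kind of filter, here is a standalone sketch; `filter_by_suffix` is a hypothetical function, and the case-insensitive matching is an assumption, since the adapter's actual matching rules are not documented here.

```python
from typing import Iterable, Optional


def filter_by_suffix(
    paths: Iterable[str], suffixes: Optional[tuple[str, ...]] = None
) -> list[str]:
    """Keep only paths ending with one of the suffixes (case-insensitive by assumption)."""
    if not suffixes:
        # No filter requested: return everything unchanged.
        return list(paths)
    lowered = tuple(s.lower() for s in suffixes)
    # str.endswith accepts a tuple and matches any of its members.
    return [p for p in paths if p.lower().endswith(lowered)]


listing = ["config/a.yaml", "config/b.yml", "config/readme.md", "config/C.YAML"]
print(filter_by_suffix(listing, suffixes=(".yaml", ".yml")))
# prints ['config/a.yaml', 'config/b.yml', 'config/C.YAML']
```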
Connection Hints
You can customize ClickZetta connection behavior using hints:
namespace:
  clickzetta_prod:
    type: clickzetta
    # ... other connection parameters
    hints:
      sdk.job.timeout: 600
      query_tag: "Datus Analytics Query"
      cz.storage.parquet.vector.index.read.memory.cache: "true"
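When hints are assembled in Python rather than YAML, they form a flat key/value mapping. The sketch below mirrors the example above and adds a hypothetical `merge_hints` helper for layering overrides on a base set; whether the adapter merges hints this way is an assumption.

```python
from typing import Any, Optional

# The hints from the YAML example above, as a plain Python mapping.
BASE_HINTS: dict[str, Any] = {
    "sdk.job.timeout": 600,
    "query_tag": "Datus Analytics Query",
    "cz.storage.parquet.vector.index.read.memory.cache": "true",
}


def merge_hints(
    base: dict[str, Any], overrides: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    """Hypothetical helper: return a new dict in which overrides win over base."""
    merged = dict(base)  # copy, so BASE_HINTS itself is never mutated
    merged.update(overrides or {})
    return merged


print(merge_hints(BASE_HINTS, {"sdk.job.timeout": 1200})["sdk.job.timeout"])
# prints 1200
```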
Error Handling
The adapter provides comprehensive error handling with detailed error messages for common issues:
- Connection failures
- Authentication errors
- Query execution errors
- Schema/workspace switching limitations
Development
Development Mode Setup (Complete Guide)
This guide covers the complete setup from Datus agent installation to ClickZetta adapter development and testing.
Prerequisites
- Python 3.11+ recommended
- Git
Step 1: Setup Datus Agent Development Environment
# Clone the Datus agent repository
git clone https://github.com/Datus-ai/datus-agent.git
cd datus-agent
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install Datus agent in editable mode
pip install -e .
Step 2: Clone and Install ClickZetta Adapter
# From your development directory
git clone https://github.com/Datus-ai/datus-db-adapters.git
cd datus-db-adapters/datus-clickzetta
# Install ClickZetta adapter in editable mode (using the same virtual environment)
pip install -e .
# Verify installation
pip show datus-clickzetta
Step 3: Configure Environment Variables
Create a .env file or set environment variables:
# ClickZetta connection settings
export CLICKZETTA_SERVICE="your-service.clickzetta.com"
export CLICKZETTA_USERNAME="your-username"
export CLICKZETTA_PASSWORD="your-password"
export CLICKZETTA_INSTANCE="your-instance-id"
export CLICKZETTA_WORKSPACE="your-workspace"
export CLICKZETTA_SCHEMA="your-schema"
export CLICKZETTA_VCLUSTER="default_ap"
# LLM API keys (optional for testing)
export DASHSCOPE_API_KEY="your-dashscope-key"
export DEEPSEEK_API_KEY="your-deepseek-key"
Step 4: Create ClickZetta Configuration
In your Datus agent directory, create a ClickZetta configuration file using the provided example:
# In datus-agent directory
cp ../datus-db-adapters/datus-clickzetta/examples/agent.clickzetta.yml.example conf/agent.clickzetta.yml
Update conf/agent.clickzetta.yml with ClickZetta settings:
agent:
  target: qwen_main # or your preferred model
  home: .datus_home
  models:
    # Your model configurations here
  namespace:
    clickzetta:
      type: clickzetta
      service: ${CLICKZETTA_SERVICE}
      username: ${CLICKZETTA_USERNAME}
      password: ${CLICKZETTA_PASSWORD}
      instance: ${CLICKZETTA_INSTANCE}
      workspace: ${CLICKZETTA_WORKSPACE}
      schema: ${CLICKZETTA_SCHEMA}
      vcluster: ${CLICKZETTA_VCLUSTER}
      secure: false
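The `${...}` placeholders in this configuration stand for the environment variables set in Step 3. How Datus resolves them internally is not described here; the following sketch shows one way such substitution can work, using a hypothetical `expand_placeholders` function that fails loudly when a variable is unset.

```python
import re

# Matches ${NAME}, where NAME is a conventional environment variable name.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_placeholders(value: str, env: dict[str, str]) -> str:
    """Replace each ${NAME} with env[NAME]; raise KeyError if NAME is not defined."""

    def _lookup(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(_lookup, value)


env = {"CLICKZETTA_SERVICE": "your-service.clickzetta.com"}
print(expand_placeholders("${CLICKZETTA_SERVICE}", env))
# prints your-service.clickzetta.com
```

In practice `dict(os.environ)` would be passed as `env`. Raising on an unset variable surfaces a forgotten `export` at load time rather than as a connection failure later.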
Step 5: Start Development and Testing
Test the adapter directly:
# From datus-clickzetta directory
python -c "
from datus_clickzetta.connector import ClickZettaConnector
import os
connector = ClickZettaConnector(
    service=os.getenv('CLICKZETTA_SERVICE'),
    username=os.getenv('CLICKZETTA_USERNAME'),
    password=os.getenv('CLICKZETTA_PASSWORD'),
    instance=os.getenv('CLICKZETTA_INSTANCE'),
    workspace=os.getenv('CLICKZETTA_WORKSPACE'),
    schema=os.getenv('CLICKZETTA_SCHEMA'),
    vcluster=os.getenv('CLICKZETTA_VCLUSTER'),
    secure=False
)
result = connector.execute('SHOW SCHEMAS')
print(f'Connected! Found {result.row_count} schemas')
"
Start Datus CLI with ClickZetta:
# From datus-agent directory
python -m datus.cli.main --config conf/agent.clickzetta.yml --namespace clickzetta
Step 6: Development Workflow
Making Changes:
- Edit code in datus-clickzetta/datus_clickzetta/connector.py
- Changes are immediately available (editable install)
- No need to reinstall the package
Testing Changes:
# Run adapter tests
cd datus-clickzetta
python test.py
# Test with Datus CLI
cd ../datus-agent
python -m datus.cli.main --config conf/agent.clickzetta.yml --namespace clickzetta
Commit and Push:
# From adapter directory
git add .
git commit -m "Your changes"
git push origin your-branch
# From agent directory (if you made agent changes)
git add .
git commit -m "Your agent changes"
git push origin your-branch
Directory Structure
your-dev-folder/
├── datus-agent/ # Datus agent repository
│ ├── .venv/ # Shared virtual environment
│ ├── conf/agent.clickzetta.yml # ClickZetta configuration
│ └── ...
└── datus-db-adapters/ # Adapters repository
└── datus-clickzetta/ # ClickZetta adapter
├── datus_clickzetta/
│ └── connector.py # Main connector code
└── ...
Tips for Development
- Editable Installs: Both packages are installed in editable mode, so code changes are immediate
- Environment Variables: Use .env files for local development, environment variables for production
- Testing: Always test both the adapter directly and through the Datus CLI
- Debugging: Use logger.debug() statements; enable with DATUS_LOG_LEVEL=DEBUG
Contributing Guidelines
- Clone the repository
- Create a feature branch
- Make your changes
- Run tests: python test.py
- Ensure code style compliance
- Submit a pull request
Common Development Issues
Import Errors:
- Ensure both packages are installed in editable mode
- Check virtual environment is activated
Connection Issues:
- Verify environment variables are set
- Test connection with the direct connector test above
CLI Issues:
- Check configuration file syntax
- Verify namespace configuration matches your environment
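The import and virtual-environment checks above can be scripted. This is a small diagnostic sketch using only the standard library; the module names `datus` and `datus_clickzetta` are taken from the commands earlier in this guide, and `check_modules` and `in_virtualenv` are hypothetical helper names.

```python
import importlib.util
import sys


def check_modules(names: list[str]) -> dict[str, bool]:
    """Report, for each top-level module name, whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


def in_virtualenv() -> bool:
    """True when running inside a venv (sys.prefix differs from sys.base_prefix)."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    for module, found in check_modules(["datus", "datus_clickzetta"]).items():
        print(f"{module}: {'ok' if found else 'NOT FOUND'}")
```

A `NOT FOUND` line for either module suggests the editable install was run in a different environment from the one currently active.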
Testing
This adapter includes comprehensive test coverage with multiple test types and execution modes.
Test Structure
tests/
├── unit/ # Unit tests for individual components
├── integration/
│ ├── conftest.py # TPC-H fixtures and test data
│ ├── test_connector_integration.py # Connector integration tests
│ └── test_tpch.py # TPC-H benchmark tests
├── run_tests.py # Main test runner with multiple modes
├── comprehensive_test.py # Real connection testing script
└── conftest.py # Shared test fixtures and configuration
Running Tests
Quick Start (from project root):
# Run all tests
python test.py
# Run specific test types
python test.py --mode unit # Unit tests only (fastest)
python test.py --mode integration # Integration tests only
python test.py --mode all # All tests
python test.py --mode coverage # Tests with coverage report
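For orientation, a mode-based runner of this kind can be thought of as a mapping from mode names to pytest arguments. The sketch below is an illustration under assumptions, not the contents of test.py: the paths follow the test tree shown above, and the `coverage` entry assumes the pytest-cov plugin is installed.

```python
import argparse

# Assumed mapping from mode to pytest arguments; the real runner may differ.
MODE_ARGS: dict[str, list[str]] = {
    "unit": ["tests/unit"],
    "integration": ["tests/integration"],
    "all": ["tests"],
    # --cov and --cov-report are provided by the pytest-cov plugin.
    "coverage": ["tests", "--cov=datus_clickzetta", "--cov-report=term-missing"],
}


def build_pytest_args(argv: list[str]) -> list[str]:
    """Translate --mode/-v command-line options into a pytest argument list."""
    parser = argparse.ArgumentParser(description="Sketch of a mode-based test runner")
    parser.add_argument("--mode", choices=sorted(MODE_ARGS), default="all")
    parser.add_argument("-v", "--verbose", action="store_true")
    options = parser.parse_args(argv)
    args = list(MODE_ARGS[options.mode])  # copy so the table is never mutated
    if options.verbose:
        args.append("-v")
    return args


print(build_pytest_args(["--mode", "unit", "-v"]))
# prints ['tests/unit', '-v']
```

The resulting list could then be handed to `pytest.main(...)` to run the selected tests.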
Advanced Usage (from tests/ directory):
cd tests
# Basic test execution
python run_tests.py --mode unit
python run_tests.py --mode integration -v
# Real connection testing (requires credentials)
python comprehensive_test.py
# Direct pytest usage
pytest unit/ # Unit tests
pytest integration/ # Integration tests
pytest -k "test_config" # Specific test patterns
TPC-H Integration Tests
TPC-H integration tests use a simplified TPC-H dataset (5 tables: region, nation, customer, orders, supplier) to validate end-to-end query execution, JOIN operations, aggregations, and multi-format output.
# Set ClickZetta credentials
export CLICKZETTA_SERVICE="your-service.clickzetta.com"
export CLICKZETTA_USERNAME="your-username"
export CLICKZETTA_PASSWORD="your-password"
export CLICKZETTA_INSTANCE="your-instance"
export CLICKZETTA_WORKSPACE="your-workspace"
# Initialize TPC-H test data
uv run python scripts/init_tpch_data.py
# Run TPC-H integration tests
uv run pytest tests/integration/test_tpch.py -m integration -v
# Clean re-init (drop and recreate tables)
uv run python scripts/init_tpch_data.py --drop
TPC-H Tables:
| Table | Rows | Description |
|---|---|---|
| tpch_region | 5 | Standard TPC-H regions |
| tpch_nation | 25 | Standard TPC-H nations |
| tpch_customer | 10 | Simplified customer data |
| tpch_orders | 15 | Simplified order data |
| tpch_supplier | 5 | Simplified supplier data |
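To illustrate the kind of JOIN and aggregation these tests exercise, the sketch below runs a region/nation query against an in-memory SQLite database instead of ClickZetta. The table names come from the list above; the column names follow the TPC-H specification, and only a handful of rows are loaded, so this is not the dataset created by scripts/init_tpch_data.py.

```python
import sqlite3

# SQLite stands in for ClickZetta here so the example runs without credentials.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tpch_region (r_regionkey INTEGER PRIMARY KEY, r_name TEXT);
    CREATE TABLE tpch_nation (
        n_nationkey INTEGER PRIMARY KEY, n_name TEXT, n_regionkey INTEGER
    );
    INSERT INTO tpch_region VALUES (0, 'AFRICA'), (1, 'AMERICA'), (2, 'ASIA');
    INSERT INTO tpch_nation VALUES
        (0, 'ALGERIA', 0), (1, 'ARGENTINA', 1), (2, 'BRAZIL', 1),
        (3, 'CANADA', 1), (8, 'INDIA', 2), (9, 'INDONESIA', 2);
    """
)

# Count nations per region: a JOIN followed by a GROUP BY aggregation.
rows = conn.execute(
    """
    SELECT r.r_name, COUNT(*) AS nation_count
    FROM tpch_region AS r
    JOIN tpch_nation AS n ON n.n_regionkey = r.r_regionkey
    GROUP BY r.r_name
    ORDER BY r.r_name
    """
).fetchall()
conn.close()

print(rows)
# prints [('AFRICA', 1), ('AMERICA', 3), ('ASIA', 2)]
```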
Test Requirements
- Unit Tests: No external dependencies, run with mocked components
- Integration Tests: Mocked ClickZetta SDK, test connector logic
- TPC-H Tests: Require actual ClickZetta credentials
- Real Connection Tests: Require actual ClickZetta credentials
Environment Variables:
| Variable | Default | Description |
|---|---|---|
| CLICKZETTA_SERVICE | (required) | ClickZetta service endpoint |
| CLICKZETTA_USERNAME | (required) | ClickZetta username |
| CLICKZETTA_PASSWORD | (required) | ClickZetta password |
| CLICKZETTA_INSTANCE | (required) | ClickZetta instance ID |
| CLICKZETTA_WORKSPACE | (required) | ClickZetta workspace |
| CLICKZETTA_SCHEMA | PUBLIC | Default schema |
| CLICKZETTA_VCLUSTER | DEFAULT_AP | Virtual cluster |
Test Coverage
- Configuration validation and error handling
- SQL query execution and result processing
- Metadata discovery (tables, views, schemas)
- Connection management and lifecycle
- Volume operations and file listing
- Error handling and exception cases
- TPC-H benchmark queries (JOINs, aggregations, multi-format output)
License
This project is licensed under the Apache 2.0 License - see the LICENSE file for details.
Support
For issues and questions: