Skip to main content

ClickZetta database adapter for Datus

Project description

Datus ClickZetta Adapter

This package provides a ClickZetta Lakehouse adapter for Datus, enabling seamless integration with ClickZetta analytics platform.

ClickZetta is developed by Singdata and Yunqi.

Installation

pip install datus-clickzetta

Dependencies

This adapter requires the following ClickZetta Python packages:

  • clickzetta-connector-python
  • clickzetta-zettapark-python

Configuration

Configure ClickZetta connection in your Datus configuration. A complete example is available at examples/agent.clickzetta.yml.example.

namespace:
  clickzetta_prod:
    type: clickzetta
    service: "your-service-endpoint.clickzetta.com"
    username: "your-username"
    password: "your-password"
    instance: "your-instance-id"
    workspace: "your-workspace"
    schema: "PUBLIC"  # optional, defaults to PUBLIC
    vcluster: "DEFAULT_AP"  # optional, defaults to DEFAULT_AP
    secure: false  # optional

Configuration Parameters

Parameter Type Required Default Description
service string Yes - ClickZetta service endpoint
username string Yes - ClickZetta username
password string Yes - ClickZetta password
instance string Yes - ClickZetta instance identifier
workspace string Yes - ClickZetta workspace name
schema string No "PUBLIC" Default schema name
vcluster string No "DEFAULT_AP" Virtual cluster name
secure boolean No null Enable secure connection
hints object No {} Additional connection hints
extra object No {} Extra connection parameters

Features

  • Full SQL Support: Execute queries, DDL, DML operations
  • Metadata Discovery: Automatic discovery of databases, schemas, tables, and views
  • Volume Integration: Read files from ClickZetta volumes
  • Sample Data: Extract sample rows for data profiling
  • Connection Management: Automatic connection pooling and session management

Usage

Once installed and configured, use the ClickZetta adapter with Datus:

# Execute queries
result = agent.query("SELECT * FROM my_table LIMIT 10")

# Get table information
tables = agent.get_tables("my_schema")

Volume Operations

The adapter supports reading files from ClickZetta volumes:

# Read a file from a volume
content = connector.read_volume_file("volume:user://my_volume", "path/to/file.yaml")

# List files in a volume directory
files = connector.list_volume_files("volume:user://my_volume", "config/", suffixes=(".yaml", ".yml"))

Connection Hints

You can customize ClickZetta connection behavior using hints:

namespace:
  clickzetta_prod:
    type: clickzetta
    # ... other connection parameters
    hints:
      sdk.job.timeout: 600
      query_tag: "Datus Analytics Query"
      cz.storage.parquet.vector.index.read.memory.cache: "true"

Error Handling

The adapter provides comprehensive error handling with detailed error messages for common issues:

  • Connection failures
  • Authentication errors
  • Query execution errors
  • Schema/workspace switching limitations

Development

Development Mode Setup (Complete Guide)

This guide covers the complete setup from Datus agent installation to ClickZetta adapter development and testing.

Prerequisites

  • Python 3.11+ recommended
  • Git

Step 1: Setup Datus Agent Development Environment

# Clone the Datus agent repository
git clone https://github.com/Datus-ai/datus-agent.git
cd datus-agent

# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install Datus agent in editable mode
pip install -e .

Step 2: Clone and Install ClickZetta Adapter

# From your development directory
git clone https://github.com/Datus-ai/datus-db-adapters.git
cd datus-db-adapters/datus-clickzetta

# Install ClickZetta adapter in editable mode (using the same virtual environment)
pip install -e .

# Verify installation
pip show datus-clickzetta

Step 3: Configure Environment Variables

Create a .env file or set environment variables:

# ClickZetta connection settings
export CLICKZETTA_SERVICE="your-service.clickzetta.com"
export CLICKZETTA_USERNAME="your-username"
export CLICKZETTA_PASSWORD="your-password"
export CLICKZETTA_INSTANCE="your-instance-id"
export CLICKZETTA_WORKSPACE="your-workspace"
export CLICKZETTA_SCHEMA="your-schema"
export CLICKZETTA_VCLUSTER="default_ap"

# LLM API keys (optional for testing)
export DASHSCOPE_API_KEY="your-dashscope-key"
export DEEPSEEK_API_KEY="your-deepseek-key"

Step 4: Create ClickZetta Configuration

In your Datus agent directory, create a ClickZetta configuration file using the provided example:

# In datus-agent directory
cp ../datus-db-adapters/datus-clickzetta/examples/agent.clickzetta.yml.example conf/agent.clickzetta.yml

Update conf/agent.clickzetta.yml with ClickZetta settings:

agent:
  target: qwen_main  # or your preferred model
  home: .datus_home

  models:
    # Your model configurations here

  namespace:
    clickzetta:
      type: clickzetta
      service: ${CLICKZETTA_SERVICE}
      username: ${CLICKZETTA_USERNAME}
      password: ${CLICKZETTA_PASSWORD}
      instance: ${CLICKZETTA_INSTANCE}
      workspace: ${CLICKZETTA_WORKSPACE}
      schema: ${CLICKZETTA_SCHEMA}
      vcluster: ${CLICKZETTA_VCLUSTER}
      secure: false

Step 5: Start Development and Testing

Test the adapter directly:

# From datus-clickzetta directory
python -c "
from datus_clickzetta.connector import ClickZettaConnector
import os

connector = ClickZettaConnector(
    service=os.getenv('CLICKZETTA_SERVICE'),
    username=os.getenv('CLICKZETTA_USERNAME'),
    password=os.getenv('CLICKZETTA_PASSWORD'),
    instance=os.getenv('CLICKZETTA_INSTANCE'),
    workspace=os.getenv('CLICKZETTA_WORKSPACE'),
    schema=os.getenv('CLICKZETTA_SCHEMA'),
    vcluster=os.getenv('CLICKZETTA_VCLUSTER'),
    secure=False
)

result = connector.execute('SHOW SCHEMAS')
print(f'Connected! Found {result.row_count} schemas')
"

Start Datus CLI with ClickZetta:

# From datus-agent directory
python -m datus.cli.main --config conf/agent.clickzetta.yml --namespace clickzetta

Step 6: Development Workflow

Making Changes:

  1. Edit code in datus-clickzetta/datus_clickzetta/connector.py
  2. Changes are immediately available (editable install)
  3. No need to reinstall the package

Testing Changes:

# Run adapter tests
cd datus-clickzetta
python test.py

# Test with Datus CLI
cd ../datus-agent
python -m datus.cli.main --config conf/agent.clickzetta.yml --namespace clickzetta

Commit and Push:

# From adapter directory
git add .
git commit -m "Your changes"
git push origin your-branch

# From agent directory (if you made agent changes)
git add .
git commit -m "Your agent changes"
git push origin your-branch

Directory Structure

your-dev-folder/
├── datus-agent/                    # Datus agent repository
│   ├── .venv/                     # Shared virtual environment
│   ├── conf/agent.clickzetta.yml  # ClickZetta configuration
│   └── ...
└── datus-db-adapters/             # Adapters repository
    └── datus-clickzetta/          # ClickZetta adapter
        ├── datus_clickzetta/
        │   └── connector.py       # Main connector code
        └── ...

Tips for Development

  • Editable Installs: Both packages are installed in editable mode, so code changes are immediate
  • Environment Variables: Use .env files for local development, environment variables for production
  • Testing: Always test both the adapter directly and through the Datus CLI
  • Debugging: Use logger.debug() statements; enable with DATUS_LOG_LEVEL=DEBUG

Contributing Guidelines

  1. Clone the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: python test.py
  5. Ensure code style compliance
  6. Submit a pull request

Common Development Issues

Import Errors:

  • Ensure both packages are installed in editable mode
  • Check virtual environment is activated

Connection Issues:

  • Verify environment variables are set
  • Test connection with the direct connector test above

CLI Issues:

  • Check configuration file syntax
  • Verify namespace configuration matches your environment

Testing

This adapter includes comprehensive test coverage with multiple test types and execution modes.

Test Structure

tests/
├── unit/                     # Unit tests for individual components
├── integration/              # Integration tests with mocked dependencies
├── run_tests.py             # Main test runner with multiple modes
├── comprehensive_test.py     # Real connection testing script
└── conftest.py              # Shared test fixtures and configuration

Running Tests

Quick Start (from project root):

# Run all tests
python test.py

# Run specific test types
python test.py --mode unit          # Unit tests only (fastest)
python test.py --mode integration   # Integration tests only
python test.py --mode all          # All tests
python test.py --mode coverage     # Tests with coverage report

Advanced Usage (from tests/ directory):

cd tests

# Basic test execution
python run_tests.py --mode unit
python run_tests.py --mode integration -v

# Real connection testing (requires credentials)
python comprehensive_test.py

# Direct pytest usage
pytest unit/                    # Unit tests
pytest integration/             # Integration tests
pytest -k "test_config"        # Specific test patterns

Test Requirements

  • Unit Tests: No external dependencies, run with mocked components
  • Integration Tests: Mocked ClickZetta SDK, test connector logic
  • Real Connection Tests: Require actual ClickZetta credentials

Set environment variables for real connection testing:

export CLICKZETTA_SERVICE="your-service.clickzetta.com"
export CLICKZETTA_USERNAME="your-username"
export CLICKZETTA_PASSWORD="your-password"
export CLICKZETTA_INSTANCE="your-instance"
export CLICKZETTA_WORKSPACE="your-workspace"
export CLICKZETTA_SCHEMA="your-schema"
export CLICKZETTA_VCLUSTER="your-vcluster"

Test Coverage

  • ✅ Configuration validation and error handling
  • ✅ SQL query execution and result processing
  • ✅ Metadata discovery (tables, views, schemas)
  • ✅ Connection management and lifecycle
  • ✅ Volume operations and file listing
  • ✅ Error handling and exception cases

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Support

For issues and questions:

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datus_clickzetta-0.1.1.tar.gz (28.4 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

datus_clickzetta-0.1.1-py3-none-any.whl (15.3 kB view details)

Uploaded Python 3

File details

Details for the file datus_clickzetta-0.1.1.tar.gz.

File metadata

  • Download URL: datus_clickzetta-0.1.1.tar.gz
  • Upload date:
  • Size: 28.4 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: uv/0.7.4

File hashes

Hashes for datus_clickzetta-0.1.1.tar.gz
Algorithm Hash digest
SHA256 5d86ceda51c1b10ac42f426d63a6a09a92aa81fdb46016da86c0f0d8cc8b0600
MD5 bbf3c774388cdd832f00bf45aa61fd35
BLAKE2b-256 4a104d1278844fe9c5a7fb1ee938ff0ffca30498a1c727a20f5379f29699edfe

See more details on using hashes here.

File details

Details for the file datus_clickzetta-0.1.1-py3-none-any.whl.

File metadata

File hashes

Hashes for datus_clickzetta-0.1.1-py3-none-any.whl
Algorithm Hash digest
SHA256 657c34309566de76276799bdd6a0a8fc5cb9cf3ffb15c7ef1048ec65736c495a
MD5 e809ec5c3ba25de0154c0a2572c845ce
BLAKE2b-256 420b62c12188820d22eb5cbfe66631264f045b7b8285fe4409c9986b19303999

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page