
Python connector for AsterixDB


PyAsterix: Python Connector for Apache AsterixDB

PyAsterix is a Python library for working with Apache AsterixDB, a scalable NoSQL database management system. It offers two interfaces: a low-level, DB-API (PEP 249) compliant interface and a high-level DataFrame API.

Installation

Install from PyPI:

pip install pyasterix

With observability extras (Prometheus + OpenTelemetry):

pip install "pyasterix[observability]"

Install from source (editable for development):

git clone https://github.com/your-org/pyasterix.git
cd pyasterix
python -m venv .venv
source .venv/bin/activate   # on Windows: .venv\Scripts\activate
pip install -U pip build twine
pip install -e .

Features

Core Features

  • PEP 249 compliant database interface
  • Pandas-like DataFrame API
  • Support for both synchronous and asynchronous queries
  • Comprehensive error handling with custom, context-rich exceptions
  • Connection pooling and intelligent connection management
  • Native support for AsterixDB data types
  • Easy integration with pandas ecosystem
  • Built-in observability: metrics (Prometheus), tracing (OpenTelemetry), structured logging

DB-API Features

  • Standard cursor interface
  • Transaction support (where applicable)
  • Parameterized queries
  • Multiple result fetch methods

Advanced Features

  • Observability (metrics, tracing, logging) with production-ready configuration
  • Async query support (status/result handles, pooled polling)
  • Connection pool lifecycle management (validation, idle expiry, cleanup thread)
  • Error mapping from HTTP/AsterixDB payloads to precise exceptions

DataFrame API Features

  • Intuitive query building
  • Method chaining
  • Complex aggregations
  • Join operations
  • Filtering and sorting
  • Group by operations
  • Direct pandas DataFrame conversion

Architecture

Components

Connection Management

  • Connection pooling and lifecycle (validation, idle/lifetime expiry, background cleanup)
  • Session handling via HTTP sessions
  • Query execution including async/deferred modes
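
As a rough illustration of the pool lifecycle described above (validation and idle expiry), here is a minimal generic sketch. It is not PyAsterix's pool: the class `SimplePool`, its parameters, and the `ping()` health check on connections are all assumptions made for the example, and it omits the background cleanup thread and any cap on the total number of open connections.

```python
import queue
import time


class SimplePool:
    """Toy pool illustrating validation and idle expiry (hypothetical)."""

    def __init__(self, factory, max_size=4, max_idle_seconds=30.0,
                 clock=time.monotonic):
        self._factory = factory            # callable that opens a connection
        self._max_idle = max_idle_seconds
        self._clock = clock
        self._idle = queue.LifoQueue(maxsize=max_size)

    def acquire(self):
        while True:
            try:
                conn, returned_at = self._idle.get_nowait()
            except queue.Empty:
                return self._factory()     # nothing reusable: open a new one
            fresh = self._clock() - returned_at <= self._max_idle
            if fresh and self._is_valid(conn):
                return conn
            conn.close()                   # expired or broken: discard, retry

    def release(self, conn):
        try:
            self._idle.put_nowait((conn, self._clock()))
        except queue.Full:
            conn.close()                   # pool is full: drop the surplus

    @staticmethod
    def _is_valid(conn):
        # Assumption: connections expose a cheap health check named ping().
        try:
            return bool(conn.ping())
        except Exception:
            return False
```

Checking freshness and validity at acquire time keeps the sketch free of threads; a real pool would typically also expire idle connections in the background.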

Query Building

  • SQL++ query generation
  • Parameter binding
  • Query validation
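
To make the three steps above concrete, here is a toy sketch of how chained calls might be turned into a SQL++ string with separately bound parameters. The `Query` class and its methods are hypothetical and are not the PyAsterix DataFrame API; the `?` placeholder style and the `t` alias are likewise assumptions.

```python
import re
from dataclasses import dataclass, field

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    # Validation: reject anything that is not a plain identifier.
    if not _IDENT.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


@dataclass
class Query:
    """Hypothetical chained builder; not the PyAsterix API."""
    dataset: str
    columns: list = field(default_factory=list)
    conditions: list = field(default_factory=list)
    params: list = field(default_factory=list)
    ordering: list = field(default_factory=list)

    def select(self, *names):
        self.columns.extend(_check_ident(n) for n in names)
        return self  # returning self is what enables method chaining

    def where(self, column, op, value):
        if op not in {"=", "!=", "<", "<=", ">", ">="}:
            raise ValueError(f"unsupported operator: {op!r}")
        self.conditions.append(f"t.{_check_ident(column)} {op} ?")
        self.params.append(value)  # bound separately, never inlined
        return self

    def order_by(self, column, desc=False):
        suffix = " DESC" if desc else ""
        self.ordering.append(f"t.{_check_ident(column)}{suffix}")
        return self

    def build(self):
        if self.columns:
            projection = ", ".join(f"t.{c}" for c in self.columns)
        else:
            projection = "VALUE t"
        sql = f"SELECT {projection} FROM {_check_ident(self.dataset)} t"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        if self.ordering:
            sql += " ORDER BY " + ", ".join(self.ordering)
        return sql, tuple(self.params)


sql, params = (
    Query("Users")
    .select("name", "age")
    .where("age", ">", 30)
    .order_by("age", desc=True)
    .build()
)
```

Keeping values out of the generated string and returning them alongside it is the design choice that lets the execution layer bind them safely.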

Result Processing

  • Type conversion
  • Result caching
  • Data streaming

Observability

  • Metrics: query durations, counts, rows fetched, pool gauges, error counters
  • Tracing: spans for execute/fetch/async/pool and DataFrame operations (OTel compatible)
  • Logging: structured JSON with trace correlation and performance-aware filtering

Exception Handling

  • PEP 249 standard hierarchy + AsterixDB-specific exceptions (HTTPError, NetworkError, TimeoutError, SyntaxError, IdentifierError, AsyncQueryError, PoolExhaustedError, etc.)
  • Rich error context attached to each exception and .to_dict() serialization

Best Practices

Connection Management

  • Use context managers (with statements)
  • Close connections explicitly where a context manager is not practical
  • Implement connection pooling for web applications and batch services

Query Optimization

  • Use appropriate indexes
  • Leverage query hints when necessary
  • Monitor query performance

Error Handling

  • Implement proper exception handling
  • Use retry mechanisms for transient failures
  • Log errors appropriately
  • Prefer catching specific driver exceptions (e.g., SyntaxError, NetworkError, TimeoutError)
  • Inspect .context on exceptions and leverage .to_dict() for structured logging
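
The practices above can be sketched as a small retry helper. The exception classes below are local stand-ins that mimic the `.context` attribute and `.to_dict()` method mentioned in this document; they are not imported from PyAsterix, and `QuerySyntaxError` is named that way only to avoid shadowing Python's built-in `SyntaxError`. The helper name `run_with_retry` and its parameters are also hypothetical.

```python
import json
import logging
import time

log = logging.getLogger("app.db")


class DriverError(Exception):
    """Stand-in for a driver exception that carries context (hypothetical)."""

    def __init__(self, message, **context):
        super().__init__(message)
        self.context = context

    def to_dict(self):
        return {
            "type": type(self).__name__,
            "message": str(self),
            "context": self.context,
        }


class NetworkError(DriverError):
    pass


class QuerySyntaxError(DriverError):
    pass


def run_with_retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry transient failures with exponential backoff; re-raise the rest."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except NetworkError as exc:  # transient: worth another attempt
            log.warning("transient failure: %s",
                        json.dumps(exc.to_dict(), default=str))
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
        # QuerySyntaxError is deliberately not caught here:
        # retrying would not fix a malformed query.
```

Catching only the transient class keeps permanent errors, such as a malformed query, from being retried pointlessly.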

Observability

  • Enable metrics and tracing in non-prod first; tune sampling in prod
  • Export Prometheus metrics and view traces via OTLP/Jaeger
  • Use structured logs with correlation IDs for cross-service debugging
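
As an illustration of the last point, structured logs with a correlation ID can be produced with the standard library alone. This sketch does not use PyAsterix's built-in logging configuration, which is not shown in this document; the formatter, the `make_logger` helper, and the field names are assumptions.

```python
import json
import logging
import uuid


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            # correlation_id arrives through the `extra` argument below
            "correlation_id": getattr(record, "correlation_id", None),
        }
        return json.dumps(payload)


def make_logger(name, stream=None):
    handler = logging.StreamHandler(stream)  # stream=None means stderr
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


logger = make_logger("app.queries")
correlation_id = str(uuid.uuid4())
logger.info("query finished", extra={"correlation_id": correlation_id})
```

Passing the same correlation ID to every service that handles a request lets the JSON lines be joined across services when debugging.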

Documentation

  • Driver Overview: docs/DRIVER_OVERVIEW.md
  • DB-API Guide: docs/DBAPI_GUIDE.md
  • DataFrame Guide: docs/DATAFRAME_GUIDE.md
  • Observability for Developers: docs/OBSERVABILITY_FOR_DEVELOPERS.md
  • Exception Handling: docs/EXCEPTION_HANDLING.md

Contributing

We welcome contributions! Please follow these steps:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Create a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.
