Data Connector Python SDK

A comprehensive Python SDK for building robust data connectors with standardized error handling and graceful failure management.

Installation

Install from PyPI

pip install dc-python-sdk

Install from Source

git clone https://github.com/data-connector/dc-python-sdk.git
cd dc-python-sdk
pip install -e .

Requirements

  • Python >= 3.6
  • setuptools >= 42

Quick Start

Using the error classes

from dc_sdk import errors

# Example: Handling authentication failure
try:
    # Your authentication logic here
    authenticate_user(credentials)
except Exception as e:
    # Chain the original exception so the root cause stays in the traceback
    raise errors.AuthenticationError("Invalid credentials provided. Please check your username and password.") from e

# Example: Handling missing objects
def get_available_objects():
    objects = fetch_objects_from_api()
    if not objects:
        raise errors.NoObjectsFoundError("No tables or objects found for this account. Please ensure your account has accessible data.")
    return objects

Running the local HTTP server

Install the SDK and start the HTTP server that wraps your connector:

pip install dc-python-sdk

# starts a FastAPI server on port 8000
dc-sdk http

Then you can POST to the /invoke endpoint:

curl -X POST http://localhost:8000/invoke \
  -H "Content-Type: application/json" \
  -d '{
    "method": "get_objects",
    "credentials": { "api_key": "YOUR_KEY" },
    "params": {}
  }'
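
The same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_invoke_request` and `invoke` are illustrative and not part of the SDK, and the response is assumed to be a JSON body, which the example above does not confirm.

```python
import json
import urllib.request

INVOKE_URL = "http://localhost:8000/invoke"  # endpoint from the curl example


def build_invoke_request(method, credentials, params=None, url=INVOKE_URL):
    """Build (but do not send) a POST request mirroring the curl example."""
    payload = {
        "method": method,
        "credentials": credentials,
        "params": params or {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(method, credentials, params=None, timeout=30):
    """Send the request; assumes the server replies with a JSON body."""
    request = build_invoke_request(method, credentials, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires `dc-sdk http` to be running locally):
# print(invoke("get_objects", {"api_key": "YOUR_KEY"}))
```

Separating request construction from sending keeps the payload easy to inspect in tests without a running server.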

Using the AWS Lambda handler

The SDK also exposes a Lambda-style handler in dc_sdk.handler:

from dc_sdk.handler import handler

def lambda_handler(event, context):
    return handler(event, context)

Error Handling

Reference: Error Handling Documentation

Error handling is one of the most important ways we can provide users with clear, informative feedback—so they're not left wondering what some random "Error 500" means (because nothing says "fun" like debugging a vague server error at 2 AM).

Our system uses the dc_sdk library to gracefully throw errors. This ensures:

  • Users see helpful messages instead of cryptic stack traces
  • Unhandled errors escalate as server errors, which automatically generate a bug ticket for the engineering team
  • Errors can be tested, logged, and surfaced consistently across all connectors
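
The split between graceful and unhandled errors can be pictured with a small dispatcher. This is only a sketch and does not reproduce the SDK's internals: `ConnectorError` is a hypothetical base class, `AuthenticationError` is a local stand-in for `dc_sdk.errors.AuthenticationError`, and `run_safely`, the status codes, and the response shape are assumptions made for illustration.

```python
import logging

logger = logging.getLogger("connector")


class ConnectorError(Exception):
    """Hypothetical base class standing in for the SDK's error classes."""


class AuthenticationError(ConnectorError):
    """Stand-in for dc_sdk.errors.AuthenticationError."""


def run_safely(operation):
    """Run a connector operation and turn the outcome into a response dict.

    The response shape and status codes are assumptions, not the SDK's format.
    """
    try:
        return {"status": 200, "result": operation()}
    except ConnectorError as exc:
        # Graceful error: the message is written for the user, so pass it on.
        return {"status": 400, "error": type(exc).__name__, "message": str(exc)}
    except Exception:
        # Unhandled error: log the traceback and keep the details internal.
        logger.exception("Unhandled connector error")
        return {"status": 500, "error": "ServerError", "message": "Internal error"}
```

Anything raised deliberately reaches the user as a readable message, while everything else falls through to the generic server-error branch.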

Error Categories

The SDK provides 21 different error classes organized into logical categories. Each error is designed to handle specific failure scenarios:

Authentication Errors

  • AuthenticationError - Invalid credentials, expired tokens
  • WhitelistError - IP not whitelisted, connection refused

Object & Field Errors

  • NoObjectsFoundError - No tables/objects available
  • GetObjectsError - Failed to retrieve objects
  • NoFieldsFoundError - Object exists but has no fields
  • GetFieldsError - Cannot retrieve field information
  • BadFieldIDError - Invalid field ID for object
  • BadObjectIDError - Object ID doesn't exist

Data Filtering & Mapping Errors

  • FilterDataTypeError - Invalid data type for filtering
  • FieldDataTypeError - Unsupported field data type
  • MappingError - Data mapping failures

Data Retrieval Errors

  • DataError - Generic data retrieval failure
  • APIRequestError - API returned error status
  • APITimeoutError - Request timeout
  • APIPermissionError - Insufficient API permissions
  • NoRowsFoundError - Query successful but no data

Data Loading Errors

  • LoadDataError - Failed to load data to destination
  • NotADestinationError - Connector is read-only
  • UpdateMethodNotSupportedError - Invalid update method

Implementation Errors

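  • NotImplementedError - Required method not implemented (shares its name with Python's built-in NotImplementedError, so reference it as errors.NotImplementedError rather than importing it by name)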

For detailed examples and usage patterns, see the Examples section below.
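
One way to pick among the data retrieval errors is to map an upstream HTTP status code to the most specific class. The sketch below uses local stand-ins so it runs without the SDK installed; the `STATUS_TO_ERROR` mapping and the `raise_for_status` helper are hypothetical and should be adjusted to how the upstream API actually behaves.

```python
class AuthenticationError(Exception):
    """Stand-in for dc_sdk.errors.AuthenticationError."""


class APIPermissionError(Exception):
    """Stand-in for dc_sdk.errors.APIPermissionError."""


class APITimeoutError(Exception):
    """Stand-in for dc_sdk.errors.APITimeoutError."""


class APIRequestError(Exception):
    """Stand-in for dc_sdk.errors.APIRequestError."""


# Hypothetical mapping from upstream HTTP status codes to error classes.
STATUS_TO_ERROR = {
    401: AuthenticationError,
    403: APIPermissionError,
    408: APITimeoutError,
    504: APITimeoutError,
}


def raise_for_status(status_code, endpoint):
    """Raise the most specific error for a failed upstream response."""
    if status_code < 400:
        return
    # Fall back to the generic API error for unmapped failure codes.
    error_class = STATUS_TO_ERROR.get(status_code, APIRequestError)
    raise error_class(f"Request to '{endpoint}' failed with HTTP {status_code}.")
```

In a real connector, the stand-ins would be replaced by `from dc_sdk import errors` and the corresponding `errors.*` classes.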

Best Practices

  1. Always raise the most specific error possible (don't just raise a generic Exception)

  2. Add a helpful message—this is surfaced directly to the user

  3. Differentiate between client-facing and internal errors:

    • Client-facing errors (e.g., AuthenticationError, WhitelistError) provide actionable guidance
    • Internal errors (e.g., GetObjectsError) indicate issues within the connector or API
  4. Use descriptive error messages that help users understand what went wrong and how to fix it

  5. Include context in error messages when possible (e.g., which field, table, or operation failed)
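
As an illustration of points 4 and 5, the sketch below builds a message that names the object, the offending fields, and the valid alternatives. `BadFieldIDError` is a local stand-in for the SDK class, and `validate_field_ids` is a hypothetical helper, not part of the SDK.

```python
class BadFieldIDError(Exception):
    """Stand-in for dc_sdk.errors.BadFieldIDError."""


def validate_field_ids(object_id, requested_fields, available_fields):
    """Raise BadFieldIDError naming every unknown field and its object."""
    unknown = [field for field in requested_fields if field not in available_fields]
    if unknown:
        raise BadFieldIDError(
            f"Unknown field(s) {', '.join(unknown)} for object '{object_id}'. "
            f"Available fields: {', '.join(sorted(available_fields))}."
        )
```

A message built this way tells the user which input to change, rather than only that something was invalid.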

Examples

Basic Usage

from dc_sdk import errors

# Authentication example
def authenticate_user(api_key):
    if not api_key:
        raise errors.AuthenticationError("API key is required")
    
    if not validate_api_key(api_key):
        raise errors.AuthenticationError("Invalid API key. Please check your credentials.")

# Object validation example
def get_table_data(table_name):
    if table_name not in available_tables:
        available = ", ".join(available_tables.keys())
        raise errors.BadObjectIDError(f"Table '{table_name}' not found. Available: {available}")
    
    try:
        return fetch_table_data(table_name)
    except Exception as e:
        # Chain the original exception to preserve the underlying traceback
        raise errors.DataError(f"Failed to retrieve data from '{table_name}': {e}") from e

Advanced Error Handling

from dc_sdk import errors

class DataConnector:
    def sync_data(self, source_table, destination_table, update_method="append"):
        # Validate update method
        supported_methods = ["append", "replace"]
        if update_method not in supported_methods:
            raise errors.UpdateMethodNotSupportedError(
                f"Update method '{update_method}' not supported. Use: {', '.join(supported_methods)}"
            )
        
        # Check if connector supports destinations
        if not self.is_destination:
            raise errors.NotADestinationError("This connector is read-only and cannot receive data")
        
        try:
            # Perform the sync
            result = self.perform_sync(source_table, destination_table, update_method)
        except TimeoutError as e:
            raise errors.APITimeoutError("Sync operation timed out. Please try again with smaller batches.") from e
        except PermissionError as e:
            raise errors.APIPermissionError("Insufficient permissions to write to destination table") from e

        # Surface a failure reported by the sync itself
        if not result.success:
            raise errors.LoadDataError(f"Sync failed: {result.error_message}")
        return result
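
Because each failure maps to a distinct class, the validation paths are straightforward to test. The following sketch uses the standard library's `unittest` (pytest also collects `TestCase` classes); the error classes are local stand-ins and `ReadOnlyConnector` is a hypothetical, trimmed-down connector that only reuses the validation steps from the example above.

```python
import unittest


class UpdateMethodNotSupportedError(Exception):
    """Stand-in for dc_sdk.errors.UpdateMethodNotSupportedError."""


class NotADestinationError(Exception):
    """Stand-in for dc_sdk.errors.NotADestinationError."""


class ReadOnlyConnector:
    """Hypothetical connector containing only the validation steps."""

    is_destination = False

    def sync_data(self, source_table, destination_table, update_method="append"):
        supported_methods = ["append", "replace"]
        if update_method not in supported_methods:
            raise UpdateMethodNotSupportedError(
                f"Update method '{update_method}' not supported. "
                f"Use: {', '.join(supported_methods)}"
            )
        if not self.is_destination:
            raise NotADestinationError(
                "This connector is read-only and cannot receive data"
            )


class SyncDataTests(unittest.TestCase):
    def test_rejects_unknown_update_method(self):
        with self.assertRaises(UpdateMethodNotSupportedError):
            ReadOnlyConnector().sync_data("src", "dst", update_method="upsert")

    def test_read_only_connector_cannot_receive_data(self):
        with self.assertRaisesRegex(NotADestinationError, "read-only"):
            ReadOnlyConnector().sync_data("src", "dst")
```

Asserting on the error class, and optionally on part of the message, keeps the tests stable when the wording is later refined.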

Development

Setting up Development Environment

# Clone the repository
git clone https://github.com/data-connector/dc-python-sdk.git
cd dc-python-sdk

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e .
pip install pytest pytest-cov

# Run tests
pytest

Version Management and Release Process

Semantic Versioning

This project follows Semantic Versioning (SemVer):

  • MAJOR version (X.y.z): Breaking changes that are not backward compatible
  • MINOR version (x.Y.z): New features that are backward compatible
  • PATCH version (x.y.Z): Bug fixes and minor improvements that are backward compatible

Examples:

  • 1.4.3 → 1.4.4: Bug fixes or minor improvements
  • 1.4.3 → 1.5.0: New features added (backward compatible)
  • 1.4.3 → 2.0.0: Breaking changes (not backward compatible)
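
The bump rules can be expressed in a few lines of Python. This is a minimal sketch that assumes plain `MAJOR.MINOR.PATCH` strings with no pre-release or build suffix; `bump_version` is an illustrative helper, not part of the SDK.

```python
def bump_version(version, part):
    """Return the next version for a 'major', 'minor', or 'patch' bump."""
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"  # lower components reset to zero
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"Unknown version part: {part!r}")


print(bump_version("1.4.3", "patch"))  # 1.4.4
print(bump_version("1.4.3", "minor"))  # 1.5.0
print(bump_version("1.4.3", "major"))  # 2.0.0
```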

Current version: 1.5.0

1. Update Version Number

Edit the version in pyproject.toml:

[project]
name = "dc-python-sdk"
version = "1.5.0"  # bump this

2. Update Package Description (Optional)

While updating the version, you can also update the package description in pyproject.toml:

[project]
name = "dc-python-sdk"
version = "1.5.0"
description = "Data Connector Python SDK for building robust connectors with standardized error handling"

3. Commit Changes

# Stage your changes
git add pyproject.toml README.md

# Commit with a descriptive message
git commit -m "Bump version to 1.4.4 and update documentation"

# Push to main branch
git push origin main

4. Create GitHub Release

You have two options for creating a release:

Option A: Using GitHub Web Interface

  1. Go to your repository on GitHub
  2. Click on "Releases" in the right sidebar
  3. Click "Create a new release"
  4. Fill in the release details:
    • Tag version: v1.4.4 (must match the version in pyproject.toml, prefixed with v)
    • Release title: v1.4.4 - Description of changes
    • Description: Add release notes describing what changed
  5. Click "Publish release"
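
Since the tag has to match the packaged version, it can be worth checking the two before publishing. The sketch below reads the version with a regular expression so it works on older Python versions; `read_project_version` and `tag_matches_version` are hypothetical helpers, and the pattern assumes the first `version = "..."` line in the file belongs to the `[project]` table.

```python
import re

# Matches a line such as: version = "1.5.0"  # bump this
VERSION_PATTERN = re.compile(r'^version\s*=\s*"([^"]+)"', re.MULTILINE)


def read_project_version(pyproject_text):
    """Extract the first `version = "..."` value from pyproject.toml text."""
    match = VERSION_PATTERN.search(pyproject_text)
    if match is None:
        raise ValueError("No version field found in pyproject.toml")
    return match.group(1)


def tag_matches_version(tag, pyproject_text):
    """Return True when a tag such as 'v1.5.0' matches the declared version."""
    return tag == f"v{read_project_version(pyproject_text)}"


sample = '[project]\nname = "dc-python-sdk"\nversion = "1.5.0"  # bump this\n'
print(tag_matches_version("v1.5.0", sample))  # True
print(tag_matches_version("v1.4.4", sample))  # False
```

On Python 3.11 or later, the standard library's `tomllib` module offers a more robust way to read the same field.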

Option B: Using GitHub CLI

# Install GitHub CLI if not already installed
# Windows: winget install GitHub.cli
# macOS: brew install gh
# Linux: See https://cli.github.com/

# Create and push a tag
git tag v1.4.4
git push origin v1.4.4

# Create the release
gh release create v1.4.4 \
  --title "v1.4.4 - Enhanced Error Handling Documentation" \
  --notes "
  ## What's Changed
  - Updated comprehensive README with installation instructions
  - Added detailed error handling documentation
  - Improved code examples and best practices
  - Enhanced development setup instructions
  
  ## Installation
  \`\`\`bash
  pip install dc-python-sdk==1.4.4
  \`\`\`
  "

5. Automated Publishing

Once you create a GitHub release, the automated workflow (.github/workflows/publish.yml) will:

  1. ✅ Automatically build the package
  2. ✅ Run tests (if configured)
  3. ✅ Publish to PyPI using the stored PYPI_API_TOKEN

Monitor the workflow:

  • Go to the "Actions" tab in your GitHub repository
  • Watch the "Publish Python 🐍 package" workflow run
  • Verify successful publication to PyPI

6. Verify Release

# Check that the new version is available on PyPI
pip install dc-python-sdk==1.4.4

# Or upgrade to the latest version
pip install --upgrade dc-python-sdk

# Verify the installed version
pip show dc-python-sdk

# Confirm the package imports cleanly
python -c "import dc_sdk; print('dc_sdk imported successfully')"
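
Publication can also be checked from a script through PyPI's JSON API. This is a sketch under stated assumptions: `fetch_project_metadata` and `is_published` are illustrative names, and the check relies on the response containing a `releases` mapping keyed by version string.

```python
import json
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{project}/json"  # PyPI's JSON API


def fetch_project_metadata(project, timeout=10):
    """Download the project's metadata from PyPI (needs network access)."""
    url = PYPI_JSON_URL.format(project=project)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def is_published(metadata, version):
    """Return True when `version` appears among the listed releases."""
    return version in metadata.get("releases", {})


# Usage (performs a network request):
# metadata = fetch_project_metadata("dc-python-sdk")
# print(is_published(metadata, "1.4.4"))
```

A newly uploaded release may take a short time to appear, so a failed check right after publishing is not necessarily an error.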

Quick Release Commands

For common release scenarios, here are the complete command sequences:

Patch Release (Bug fixes):

# 1. Update version in pyproject.toml (1.5.0 → 1.5.1)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 1.5.1 - Bug fixes and documentation updates"
git push origin main
git tag v1.5.1
git push origin v1.5.1
gh release create v1.5.1 --title "v1.5.1 - Bug Fixes" --notes "Bug fixes and minor improvements"

Minor Release (New features):

# 1. Update version in pyproject.toml (1.5.1 → 1.6.0)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 1.6.0 - New error handling features"
git push origin main
git tag v1.6.0
git push origin v1.6.0
gh release create v1.6.0 --title "v1.6.0 - New Features" --notes "Added new error classes and improved documentation"

Major Release (Breaking changes):

# 1. Update version in pyproject.toml (1.6.0 → 2.0.0)
# 2. Commit and release
git add pyproject.toml
git commit -m "Bump version to 2.0.0 - Breaking changes to error API"
git push origin main
git tag v2.0.0
git push origin v2.0.0
gh release create v2.0.0 --title "v2.0.0 - Major Release" --notes "⚠️ Breaking changes: Updated error class signatures"

Complete Release Checklist

  • Update version in pyproject.toml
  • Update any documentation or changelog
  • Test changes locally
  • Commit and push changes
  • Create GitHub release with a tag that matches the new version (e.g., v1.4.4)
  • Monitor GitHub Actions workflow
  • Verify package is published to PyPI
  • Test installation of new version

Manual Building and Publishing (Alternative)

If you need to manually build and publish (not recommended for production):

# Install the build tools
python -m pip install build twine

# Build the package
python -m build

# Upload to TestPyPI first (recommended for testing)
python -m twine upload --repository testpypi dist/*

# Then upload to PyPI (requires credentials)
python -m twine upload --repository pypi dist/*

Project Structure

dc-python-sdk/
├── src/
│   └── dc_sdk/
│       ├── __init__.py
│       ├── errors.py          # All error classes
│       ├── cli.py             # dc-sdk CLI entrypoint
│       ├── server.py          # FastAPI HTTP server for connectors
│       ├── handler.py         # AWS Lambda handler
│       ├── loader.py          # Connector loader utilities
│       ├── mapping.py         # Mapping abstraction used by handler
│       ├── session.py         # In-memory session management for HTTP server
│       ├── types.py           # Shared type definitions
│       └── test_connector.py  # Test utilities
├── setup.cfg                  # Legacy setuptools configuration
├── pyproject.toml             # Project & build configuration
├── README.md                  # This file
├── LICENSE                    # MIT License
└── .github/
    └── workflows/
        └── publish.yml        # Automated PyPI publishing

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Make your changes
  4. Add tests for your changes
  5. Ensure all tests pass (pytest)
  6. Commit your changes (git commit -m 'Add amazing feature')
  7. Push to the branch (git push origin feature/amazing-feature)
  8. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Summary

Use these errors to make the user's life easier (and to keep support tickets sane). The system is designed so that graceful errors = happy users, while unhandled errors = bug tickets.

Remember: Error handling isn't just about catching exceptions—it's about providing a great user experience even when things go wrong.
