Data Retrieval Module

A standardized interface for data providers with both synchronous and asynchronous support. This module provides abstract base classes that enable consistent data retrieval patterns across different data sources (APIs, databases, files, etc.).

Features

  • 🔄 Dual API Support: Both sync and async interfaces
  • 🏗️ Abstract Base Classes: Standardized patterns for data providers
  • 🔌 Connection Management: Built-in connection handling with context managers
  • 🔄 Retry Logic: Automatic retry with configurable parameters
  • 📊 Pagination Support: Standardized pagination with QueryResult
  • 🎣 Hook Methods: Customizable validation and transformation
  • 🧪 Type Safety: Full type hints and generic support
  • Well Tested: Comprehensive unit test coverage

Installation

Basic Installation

pip install data-retrieval-module

With Async Support

pip install "data-retrieval-module[async]"

Development Installation

pip install "data-retrieval-module[dev]"

All Features

pip install "data-retrieval-module[all]"

Quick Start

Synchronous Data Provider

from data_retrieval import DataProvider, QueryResult
from data_retrieval.model import ProviderStatus

# `User` and `Database` are placeholders for your own model and client classes.
class UserProvider(DataProvider[User]):
    def _connect(self) -> None:
        self._db = Database.connect(...)
    
    def _disconnect(self) -> None:
        self._db.close()
    
    def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        users = self._db.users.find(filters)
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
provider = UserProvider()
with provider.connection(host="localhost", port=5432):
    result = provider.fetch(filters={"active": True})
    for user in result.data:
        print(user.name)

Asynchronous Data Provider

from data_retrieval import AsyncDataProvider, QueryResult

class AsyncUserProvider(AsyncDataProvider[User]):
    async def _connect(self) -> None:
        self._db = await Database.connect(...)
    
    async def _disconnect(self) -> None:
        await self._db.close()
    
    async def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        users = await self._db.users.find(filters)
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
import asyncio

async def main():
    provider = AsyncUserProvider()
    async with provider.async_connection(host="localhost", port=5432) as p:
        result = await p.fetch(filters={"active": True})
        for user in result.data:
            print(user.name)

asyncio.run(main())

Core Classes

DataProvider (Synchronous)

Abstract base class for synchronous data providers.

Key Methods:

  • connect(**config) - Establish connection
  • disconnect() - Close connection
  • fetch(*args, **kwargs) - Retrieve data
  • fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • with_retry(operation, max_retries, retry_delay) - Retry logic

Hook Methods:

  • validate(data) - Validate data
  • transform(data) - Transform data
  • health_check() - Health status
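
As a rough illustration of how the validate and transform hooks could fit together, the sketch below converts raw records and then keeps only the ones that pass validation. It is standalone and does not import the package: `apply_hooks` and the `User` dataclass are hypothetical names, and the order in which the real base class calls its hooks is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, TypeVar

R = TypeVar("R")  # raw record type
T = TypeVar("T")  # transformed record type


@dataclass
class User:  # hypothetical model, for illustration only
    name: str
    email: str


def apply_hooks(
    records: Iterable[R],
    transform: Callable[[R], T],
    validate: Callable[[T], bool],
) -> List[T]:
    """Transform each raw record, then keep only those that validate."""
    kept: List[T] = []
    for raw in records:
        item = transform(raw)
        if validate(item):
            kept.append(item)
    return kept


raw_rows = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": ""},
]
users = apply_hooks(
    raw_rows,
    transform=lambda row: User(**row),
    validate=lambda u: bool(u.email) and "@" in u.email,
)
print([u.name for u in users])
```

Passing the hooks as plain callables keeps the sketch independent of any base class; in a provider subclass the same logic would live in overridden methods.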

AsyncDataProvider (Asynchronous)

Abstract base class for asynchronous data providers.

Key Methods:

  • async connect(**config) - Establish connection
  • async disconnect() - Close connection
  • async fetch(*args, **kwargs) - Retrieve data
  • async fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • async with_retry(operation, max_retries, retry_delay) - Retry logic

QueryResult

Standardized container for query results.

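from dataclasses import dataclass
from typing import Any, Dict, List

@dataclass
class QueryResult[T]:  # PEP 695 generic class syntax, requires Python 3.12+
    data: List[T]
    total_count: int
    metadata: Dict[str, Any]
    
    def is_empty(self) -> bool:
        return self.total_count == 0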
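
One way the `total_count` field could drive pagination is sketched below. This is a standalone illustration, not the package's implementation: `PageResult` is a stand-in with the same three fields, `iter_pages` and `fetch_page` are hypothetical helpers, and the sketch assumes `total_count` holds the size of the full result set rather than of the current page. The Quick Start example sets it to `len(users)`, so check the package before relying on this.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Generic, Iterator, List, TypeVar

T = TypeVar("T")


@dataclass
class PageResult(Generic[T]):
    """Stand-in with the same three fields as QueryResult."""
    data: List[T]
    total_count: int
    metadata: Dict[str, Any] = field(default_factory=dict)


def iter_pages(
    fetch_page: Callable[[int, int], PageResult[T]], page_size: int
) -> Iterator[T]:
    """Yield items page by page until total_count items have been seen."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        yield from page.data
        offset += len(page.data)
        # Stop on an empty page too, so a wrong total_count cannot loop forever.
        if not page.data or offset >= page.total_count:
            break


ROWS = list(range(7))  # pretend data source


def fetch_page(offset: int, limit: int) -> PageResult[int]:
    """Return one slice of ROWS plus the overall row count."""
    return PageResult(data=ROWS[offset:offset + limit], total_count=len(ROWS))


print(list(iter_pages(fetch_page, page_size=3)))
```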

Advanced Usage

Custom Validation

class ValidatedProvider(DataProvider[User]):
    def validate(self, data: User) -> bool:
        # Custom validation logic; bool() ensures a real bool even if email is empty or None
        return bool(data.email) and "@" in data.email

Data Transformation

class TransformingProvider(DataProvider[User]):
    def transform(self, data: dict) -> User:
        # Convert raw data to User object
        return User(**data)

Retry Logic

provider = MyProvider()

# Retry with custom parameters
result = provider.with_retry(
    operation=lambda: provider.fetch(filters={"id": "123"}),
    max_retries=5,
    retry_delay=2.0,
    parameters={}
)
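
To make the retry parameters concrete, here is a minimal standalone sketch of a retry loop. It does not use the package: `retry_call` and `flaky` are hypothetical names, and the sketch assumes that `max_retries` counts additional attempts after the first call and that the delay between attempts is fixed. The package's `with_retry` may differ on both points.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_call(
    operation: Callable[[], T], max_retries: int = 3, retry_delay: float = 1.0
) -> T:
    """Call operation; on failure, retry up to max_retries more times."""
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(retry_delay)


calls = {"n": 0}


def flaky() -> str:
    """Fail on the first two calls, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = retry_call(flaky, max_retries=5, retry_delay=0.01)
print(result, calls["n"])
```

Catching every `Exception` is a simplification; retrying only on errors known to be transient (for example, connection failures) is usually safer.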

Context Managers

# Automatic connection management (sync provider)
with provider.connection(host="localhost") as p:
    data = p.fetch()

# Async version (needs an AsyncDataProvider instance; run inside a coroutine)
async with async_provider.async_connection(host="localhost") as p:
    data = await p.fetch()
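
For readers curious how such a context manager can guarantee cleanup, the sketch below builds one with `contextlib.contextmanager`. It is a standalone illustration under stated assumptions: `DemoProvider` is a hypothetical class, and the package's own `connection()` may be implemented differently.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


class DemoProvider:
    """Hypothetical provider that records connect/disconnect calls."""

    def __init__(self) -> None:
        self.connected = False
        self.events: List[str] = []

    def connect(self, **config: Any) -> None:
        self.connected = True
        self.events.append(f"connect {sorted(config)}")

    def disconnect(self) -> None:
        self.connected = False
        self.events.append("disconnect")

    @contextmanager
    def connection(self, **config: Any) -> Iterator["DemoProvider"]:
        """Connect on entry and always disconnect on exit."""
        self.connect(**config)
        try:
            yield self
        finally:
            self.disconnect()


provider = DemoProvider()
try:
    with provider.connection(host="localhost") as p:
        assert p.connected
        raise ValueError("boom")  # simulate a failure inside the block
except ValueError:
    pass

print(provider.connected, provider.events)
```

The `try`/`finally` around the `yield` is what closes the connection even when the body of the `with` block raises.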

Error Handling

The module provides specific exception types:

from data_retrieval.model.exceptions import (
    DataProviderError,
    ConnectionError,
    QueryError,
    ValidationError
)

try:
    result = provider.fetch(filters={"invalid": "field"})
except ConnectionError as e:
    print(f"Connection failed: {e}")
except QueryError as e:
    print(f"Query failed: {e}")
except DataProviderError as e:
    print(f"General error: {e}")
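
The ordering of the `except` clauses above matters if the specific exceptions derive from `DataProviderError`. The sketch below demonstrates this with a hypothetical hierarchy; it does not import the package, and the subclass relationships are an assumption suggested by the handler order. The connection error is renamed `ProviderConnectionError` here because importing a class called `ConnectionError` shadows Python's builtin of the same name within that module.

```python
class DataProviderError(Exception):
    """Assumed base class for all provider errors."""


class ProviderConnectionError(DataProviderError):
    """Stand-in for the package's ConnectionError."""


class QueryError(DataProviderError):
    """Stand-in for the package's QueryError."""


def classify(exc: Exception) -> str:
    """Mirror the handler order above: specific types first, base class last."""
    try:
        raise exc
    except ProviderConnectionError:
        return "connection"
    except QueryError:
        return "query"
    except DataProviderError:
        return "general"


print(classify(QueryError("bad filter")))
print(classify(DataProviderError("unknown")))
```

If the base-class handler came first, it would catch every subclass and the more specific handlers would never run.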

Development

Setup Development Environment

# Clone repository
git clone https://github.com/AbigailWilliams1692/data-retrieval-module.git
cd data-retrieval-module

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=data_retrieval --cov-report=html

# Run specific test file
pytest tests/test_data_provider.py

Code Quality

# Format code
black data_retrieval/ tests/

# Sort imports
isort data_retrieval/ tests/

# Type checking
mypy data_retrieval/

# Linting
flake8 data_retrieval/ tests/

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a list of changes and version history.

Made with ❤️ by AbigailWilliams1692
