Data Retrieval Module

A standardized interface for data providers with both synchronous and asynchronous support. This module provides abstract base classes that enable consistent data retrieval patterns across different data sources (APIs, databases, files, etc.).

Features

  • 🔄 Dual API Support: Both sync and async interfaces
  • 🏗️ Abstract Base Classes: Standardized patterns for data providers
  • 🔌 Connection Management: Built-in connection handling with context managers
  • 🔄 Retry Logic: Automatic retry with configurable parameters
  • 📊 Pagination Support: Standardized pagination with QueryResult
  • 🎣 Hook Methods: Customizable validation and transformation
  • 🧪 Type Safety: Full type hints and generic support
  • Well Tested: Comprehensive unit test coverage

Installation

Basic Installation

pip install data-retrieval-module

With Async Support

pip install "data-retrieval-module[async]"

Development Installation

pip install "data-retrieval-module[dev]"

All Features

pip install "data-retrieval-module[all]"
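To confirm that the installation succeeded, the installed distribution can be queried from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of this package.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str = "data-retrieval-module") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version())
```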

Quick Start

Synchronous Data Provider

from data_retrieval import DataProvider, QueryResult

# `User` and `Database` are placeholders for your own model and database client.
class UserProvider(DataProvider[User]):
    def _connect(self) -> None:
        # Open the underlying connection (arguments elided)
        self._db = Database.connect(...)

    def _disconnect(self) -> None:
        self._db.close()

    def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        # Materialize the results so len() works even if find() returns a lazy cursor
        users = list(self._db.users.find(filters))
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
provider = UserProvider()
with provider.connection(host="localhost", port=5432):
    result = provider.fetch(filters={"active": True})
    for user in result.data:
        print(user.name)

Asynchronous Data Provider

import asyncio

from data_retrieval import AsyncDataProvider, QueryResult

# `User` and `Database` are placeholders for your own model and async database client.
class AsyncUserProvider(AsyncDataProvider[User]):
    async def _connect(self) -> None:
        self._db = await Database.connect(...)

    async def _disconnect(self) -> None:
        await self._db.close()

    async def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        users = await self._db.users.find(filters)
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
async def main():
    provider = AsyncUserProvider()
    async with provider.async_connection(host="localhost", port=5432) as p:
        result = await p.fetch(filters={"active": True})
        for user in result.data:
            print(user.name)

# Run the coroutine from synchronous code
asyncio.run(main())

Core Classes

DataProvider (Synchronous)

Abstract base class for synchronous data providers.

Key Methods:

  • connect(**config) - Establish connection
  • disconnect() - Close connection
  • fetch(*args, **kwargs) - Retrieve data
  • fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • with_retry(operation, max_retries, retry_delay) - Retry logic

Hook Methods:

  • validate(data) - Validate data
  • transform(data) - Transform data
  • health_check() - Health status
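The hook methods listed above can be pictured as steps in a small pipeline. The sketch below is a self-contained stand-in, not the library's `DataProvider`: the class `HookedProvider`, the `User` dataclass, the order in which the hooks run, and the shape of the `health_check()` return value are all assumptions made for illustration. Whether the real base class calls these hooks automatically is not stated here.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class User:
    """Hypothetical model used only for this sketch."""
    name: str
    email: str


class HookedProvider:
    """Hypothetical stand-in that applies transform() and then validate()."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    def transform(self, data: Dict[str, Any]) -> User:
        # Convert a raw row into a typed object.
        return User(**data)

    def validate(self, data: User) -> bool:
        # Keep only users with a plausible email address.
        return bool(data.email) and "@" in data.email

    def health_check(self) -> Dict[str, Any]:
        # Assumed return shape: a small status dictionary.
        return {"healthy": True, "rows": len(self._rows)}

    def fetch(self) -> List[User]:
        # Transform every raw row, then drop the ones that fail validation.
        users = [self.transform(row) for row in self._rows]
        return [user for user in users if self.validate(user)]


provider = HookedProvider([
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": ""},
])
valid_users = provider.fetch()
print([user.name for user in valid_users])
```

Running transformation before validation lets `validate` work on typed objects rather than raw dictionaries; the opposite order is equally plausible if raw input needs to be checked first.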

AsyncDataProvider (Asynchronous)

Abstract base class for asynchronous data providers.

Key Methods:

  • async connect(**config) - Establish connection
  • async disconnect() - Close connection
  • async fetch(*args, **kwargs) - Retrieve data
  • async fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • async with_retry(operation, max_retries, retry_delay) - Retry logic
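One practical reason to choose the asynchronous interface is that several `fetch` calls can be awaited concurrently. The sketch below uses a hypothetical `FakeAsyncProvider` with an awaitable `fetch()` in place of a real `AsyncDataProvider` subclass; the `user_id` keyword is also an assumption.

```python
import asyncio
from typing import Any, Dict, List


class FakeAsyncProvider:
    """Hypothetical stand-in exposing an awaitable fetch()."""

    async def fetch(self, **kwargs: Any) -> List[Dict[str, Any]]:
        await asyncio.sleep(0.01)  # simulate I/O latency
        return [{"id": kwargs.get("user_id")}]


async def fetch_many(provider: FakeAsyncProvider, user_ids: List[int]) -> List[Any]:
    # gather() runs the fetches concurrently and returns results in input order.
    return await asyncio.gather(*(provider.fetch(user_id=uid) for uid in user_ids))


results = asyncio.run(fetch_many(FakeAsyncProvider(), [1, 2, 3]))
print(results)
```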

QueryResult

Standardized container for query results.

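from dataclasses import dataclass
from typing import Any, Dict, List

# The class-level type parameter syntax below requires Python 3.12 or later
@dataclass
class QueryResult[T]:
    data: List[T]
    total_count: int
    metadata: Dict[str, Any]

    def is_empty(self) -> bool:
        return self.total_count == 0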
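A container with these three fields lends itself to offset-based pagination. The sketch below is self-contained and uses a stand-in `PageResult` rather than the library's `QueryResult`. It assumes that `total_count` holds the size of the full result set rather than the size of the current page, and the `offset` and `limit` metadata keys are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Generic, List, TypeVar

T = TypeVar("T")


@dataclass
class PageResult(Generic[T]):
    """Stand-in with the same three fields as QueryResult."""
    data: List[T]
    total_count: int
    metadata: Dict[str, Any] = field(default_factory=dict)

    def is_empty(self) -> bool:
        return self.total_count == 0


def fetch_page(items: List[T], offset: int, limit: int) -> PageResult[T]:
    """Return one page; total_count is the size of the full result set."""
    return PageResult(
        data=items[offset:offset + limit],
        total_count=len(items),
        metadata={"offset": offset, "limit": limit},
    )


records = list(range(7))
offset = 0
pages: List[List[int]] = []
while True:
    page = fetch_page(records, offset, limit=3)
    if not page.data:
        break  # nothing left to read
    pages.append(page.data)
    offset += len(page.data)
    if offset >= page.total_count:
        break  # every record has been seen
print(pages)
```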

Advanced Usage

Custom Validation

class ValidatedProvider(DataProvider[User]):
    def validate(self, data: User) -> bool:
        # Require a non-empty email containing "@"; bool() ensures a real bool is returned
        return bool(data.email) and "@" in data.email

Data Transformation

class TransformingProvider(DataProvider[User]):
    def transform(self, data: dict) -> User:
        # Convert raw data to User object
        return User(**data)

Retry Logic

provider = MyProvider()

# Retry with custom parameters
result = provider.with_retry(
    operation=lambda: provider.fetch(filters={"id": "123"}),
    max_retries=5,
    retry_delay=2.0,
    parameters={}
)
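To make the retry parameters concrete, here is a self-contained sketch of a fixed-delay retry loop. It is not the library's implementation: the function name `with_retry_sketch` is hypothetical, `max_retries` is assumed to count retries after the first attempt, and the delay is assumed to be constant rather than exponential.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retry_sketch(
    operation: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> T:
    """Call operation(); on failure wait retry_delay seconds and try again.

    Assumption: max_retries counts retries after the first attempt.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    attempt = 0
    while True:
        try:
            return operation()
        except Exception:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            attempt += 1
            time.sleep(retry_delay)


attempts = {"count": 0}


def flaky_fetch() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = with_retry_sketch(flaky_fetch, max_retries=5, retry_delay=0.01)
print(result, attempts["count"])
```

Passing the operation as a zero-argument callable, as the example above does with a lambda, keeps the retry helper independent of the arguments that `fetch` takes.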

Context Managers

# Automatic connection management
with provider.connection(host="localhost") as p:
    data = p.fetch()

# Async version (requires an AsyncDataProvider instance and must run inside a coroutine)
async with provider.async_connection(host="localhost") as p:
    data = await p.fetch()
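The value of a connection context manager is that the connection is closed even when the body raises. The sketch below shows one common way to build such a manager with `contextlib.contextmanager`; `SketchProvider` is a hypothetical stand-in, and the library's own `connection()` may be implemented differently.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


class SketchProvider:
    """Hypothetical stand-in that records connect/disconnect calls."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def connect(self, **config: Any) -> None:
        self.events.append("connect")

    def disconnect(self) -> None:
        self.events.append("disconnect")

    @contextmanager
    def connection(self, **config: Any) -> Iterator["SketchProvider"]:
        self.connect(**config)
        try:
            yield self
        finally:
            # Runs on normal exit and when the body raises.
            self.disconnect()


sketch = SketchProvider()
try:
    with sketch.connection(host="localhost") as p:
        raise RuntimeError("query failed")
except RuntimeError:
    pass
print(sketch.events)
```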

Error Handling

The module provides specific exception types:

# Note: importing ConnectionError here shadows Python's built-in ConnectionError in this module
from data_retrieval.model.exceptions import (
    DataProviderError,
    ConnectionError,
    QueryError,
    ValidationError
)

try:
    result = provider.fetch(filters={"invalid": "field"})
except ConnectionError as e:
    print(f"Connection failed: {e}")
except QueryError as e:
    print(f"Query failed: {e}")
except DataProviderError as e:
    print(f"General error: {e}")
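The order of the `except` clauses matters if the specific exceptions derive from `DataProviderError`, which the example above suggests but does not state. The sketch below defines a hypothetical hierarchy with `Sketch`-prefixed names, so that it does not shadow any built-in, and shows that the most specific matching handler wins when handlers are listed from specific to general.

```python
class SketchProviderError(Exception):
    """Hypothetical base class for all provider errors."""


class SketchConnectionError(SketchProviderError):
    """Hypothetical error for connection failures."""


class SketchQueryError(SketchProviderError):
    """Hypothetical error for failed queries."""


def classify(exc: Exception) -> str:
    """Raise exc and report which handler caught it."""
    try:
        raise exc
    except SketchConnectionError:
        return "connection"
    except SketchQueryError:
        return "query"
    except SketchProviderError:
        # Base class last: it would otherwise swallow the subclasses above.
        return "general"


print(classify(SketchQueryError("bad filter")))
print(classify(SketchProviderError("something else")))
```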

Development

Setup Development Environment

# Clone repository
git clone https://github.com/AbigailWilliams1692/data-retrieval-module.git
cd data-retrieval-module

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=data_retrieval --cov-report=html

# Run specific test file
pytest tests/test_data_provider.py

Code Quality

# Format code
black data_retrieval/ tests/

# Sort imports
isort data_retrieval/ tests/

# Type checking
mypy data_retrieval/

# Linting
flake8 data_retrieval/ tests/

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a list of changes and version history.


Made with ❤️ by AbigailWilliams1692
