Data Retrieval Module

A standardized interface for data providers with both synchronous and asynchronous support. This module provides abstract base classes that enable consistent data retrieval patterns across different data sources (APIs, databases, files, etc.).

Features

  • 🔄 Dual API Support: Both sync and async interfaces
  • 🏗️ Abstract Base Classes: Standardized patterns for data providers
  • 🔌 Connection Management: Built-in connection handling with context managers
  • 🔄 Retry Logic: Automatic retry with configurable parameters
  • 📊 Pagination Support: Standardized pagination with QueryResult
  • 🎣 Hook Methods: Customizable validation and transformation
  • 🧪 Type Safety: Full type hints and generic support
  • ✅ Well Tested: Comprehensive unit test coverage

Installation

Basic Installation

pip install data-retrieval-module

With Async Support

pip install "data-retrieval-module[async]"

Development Installation

pip install "data-retrieval-module[dev]"

All Features

pip install "data-retrieval-module[all]"

Quick Start

Synchronous Data Provider

from data_retrieval import DataProvider, QueryResult

# User and Database are placeholders for your own model and client library.
class UserProvider(DataProvider[User]):
    def _connect(self) -> None:
        self._db = Database.connect(...)

    def _disconnect(self) -> None:
        self._db.close()

    def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        # Materialize the results so len() works even if find() returns a cursor.
        users = list(self._db.users.find(filters))
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
provider = UserProvider()
with provider.connection(host="localhost", port=5432):
    result = provider.fetch(filters={"active": True})
    for user in result.data:
        print(user.name)

Asynchronous Data Provider

from data_retrieval import AsyncDataProvider, QueryResult

class AsyncUserProvider(AsyncDataProvider[User]):
    async def _connect(self) -> None:
        self._db = await Database.connect(...)
    
    async def _disconnect(self) -> None:
        await self._db.close()
    
    async def fetch(self, *args, **kwargs) -> QueryResult[User]:
        filters = kwargs.get("filters", {})
        users = await self._db.users.find(filters)
        return QueryResult(
            data=users,
            total_count=len(users),
            metadata={"source": "database"}
        )

# Usage
import asyncio

async def main():
    provider = AsyncUserProvider()
    async with provider.async_connection(host="localhost", port=5432) as p:
        result = await p.fetch(filters={"active": True})
        for user in result.data:
            print(user.name)

asyncio.run(main())

Core Classes

DataProvider (Synchronous)

Abstract base class for synchronous data providers.

Key Methods:

  • connect(**config) - Establish connection
  • disconnect() - Close connection
  • fetch(*args, **kwargs) - Retrieve data
  • fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • with_retry(operation, max_retries, retry_delay) - Retry logic

Hook Methods:

  • validate(data) - Validate data
  • transform(data) - Transform data
  • health_check() - Health status

AsyncDataProvider (Asynchronous)

Abstract base class for asynchronous data providers.

Key Methods:

  • async connect(**config) - Establish connection
  • async disconnect() - Close connection
  • async fetch(*args, **kwargs) - Retrieve data
  • async fetch_or_raise(*args, **kwargs) - Fetch with error handling
  • async with_retry(operation, max_retries, retry_delay) - Retry logic
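One practical reason to choose the async interface is that several fetches can run concurrently. The sketch below illustrates this with a hypothetical `FakeAsyncProvider` that only mimics the method names listed above; it does not import or reproduce the library's `AsyncDataProvider`, and the `filters={"id": ...}` shape is an assumption.

```python
import asyncio
from typing import Any, Dict, List


class FakeAsyncProvider:
    """Hypothetical stand-in exposing the same method names as AsyncDataProvider."""

    def __init__(self) -> None:
        self.connected = False

    async def connect(self, **config: Any) -> None:
        await asyncio.sleep(0)  # simulate connection I/O
        self.connected = True

    async def disconnect(self) -> None:
        await asyncio.sleep(0)
        self.connected = False

    async def fetch(self, *args: Any, **kwargs: Any) -> List[Dict[str, Any]]:
        await asyncio.sleep(0.01)  # simulate a slow query
        user_id = kwargs["filters"]["id"]
        return [{"id": user_id}]


async def fetch_many(ids: List[int]) -> List[List[Dict[str, Any]]]:
    provider = FakeAsyncProvider()
    await provider.connect(host="localhost")
    try:
        # gather() schedules all fetches at once and preserves argument order.
        return await asyncio.gather(
            *(provider.fetch(filters={"id": i}) for i in ids)
        )
    finally:
        await provider.disconnect()


results = asyncio.run(fetch_many([1, 2, 3]))
```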

QueryResult

Standardized container for query results.

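from dataclasses import dataclass
from typing import Any, Dict, List

# The class-level type parameter syntax below requires Python 3.12+.
@dataclass
class QueryResult[T]:
    data: List[T]
    total_count: int
    metadata: Dict[str, Any]

    def is_empty(self) -> bool:
        return self.total_count == 0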
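As an illustration of how a container with these fields can drive pagination, the sketch below walks through a data set page by page. `PageResult` is a local stand-in that mirrors the fields of `QueryResult`, and the `offset`/`limit` parameters and the `fetch_page` helper are hypothetical; the library's own paging parameters may be named differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Generic, Iterator, List, TypeVar

T = TypeVar("T")


@dataclass
class PageResult(Generic[T]):
    """Local stand-in mirroring the QueryResult fields."""
    data: List[T]
    total_count: int
    metadata: Dict[str, Any] = field(default_factory=dict)

    def is_empty(self) -> bool:
        return self.total_count == 0


ROWS = list(range(10))  # pretend data source


def fetch_page(offset: int, limit: int) -> PageResult[int]:
    # Hypothetical paged fetch: total_count is the size of the full result set.
    return PageResult(
        data=ROWS[offset:offset + limit],
        total_count=len(ROWS),
        metadata={"offset": offset, "limit": limit},
    )


def iter_all(limit: int = 4) -> Iterator[int]:
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        yield from page.data
        offset += len(page.data)
        # Stop when a page comes back empty or everything has been seen.
        if not page.data or offset >= page.total_count:
            break


all_rows = list(iter_all())
```

Treating `total_count` as the size of the whole result set, rather than of the current page, is what lets the caller know when to stop.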

Advanced Usage

Custom Validation

class ValidatedProvider(DataProvider[User]):
    def validate(self, data: User) -> bool:
        # Custom validation logic
        return bool(data.email) and "@" in data.email

Data Transformation

class TransformingProvider(DataProvider[User]):
    def transform(self, data: dict) -> User:
        # Convert raw data to User object
        return User(**data)
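The two hooks are typically combined into a small pipeline. The following standalone sketch shows one plausible ordering (transform raw records first, then drop the ones that fail validation). The `User` dataclass, the `process` helper, and the ordering itself are assumptions for illustration; the library may call its hooks in a different order.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class User:
    name: str
    email: str


def transform(raw: Dict[str, Any]) -> User:
    # Convert a raw record into a User object.
    return User(**raw)


def validate(user: User) -> bool:
    # Accept only users with a non-empty email containing "@".
    return bool(user.email) and "@" in user.email


def process(raw_rows: List[Dict[str, Any]]) -> List[User]:
    # Assumed ordering: transform first, then keep only valid records.
    users = (transform(row) for row in raw_rows)
    return [u for u in users if validate(u)]


raw_rows = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bob", "email": ""},
]
valid_users = process(raw_rows)
```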

Retry Logic

provider = MyProvider()

# Retry with custom parameters
result = provider.with_retry(
    operation=lambda: provider.fetch(filters={"id": "123"}),
    max_retries=5,
    retry_delay=2.0,
    parameters={}
)
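For readers who want to see what a retry helper with these parameters might do internally, here is a minimal sketch. `with_retry_sketch` is a hypothetical function, not the library's `with_retry`; it assumes one initial attempt plus up to `max_retries` retries, a fixed delay between attempts, and that every exception is retryable.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retry_sketch(
    operation: Callable[[], T],
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> T:
    """Hypothetical retry loop; the library's with_retry may differ."""
    last_error: Optional[Exception] = None
    # Assumption: one initial attempt plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except Exception as exc:
            last_error = exc
            if attempt < max_retries:
                time.sleep(retry_delay)  # fixed delay between attempts
    assert last_error is not None
    raise last_error  # all attempts failed: surface the last error


attempts = {"count": 0}


def flaky_fetch() -> str:
    # Fails twice, then succeeds on the third call.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = with_retry_sketch(flaky_fetch, max_retries=5, retry_delay=0.01)
```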

Context Managers

# Automatic connection management
with provider.connection(host="localhost") as p:
    data = p.fetch()

# Async version (inside a coroutine, with an AsyncDataProvider instance)
async with async_provider.async_connection(host="localhost") as p:
    data = await p.fetch()
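The value of a connection context manager is that the connection is closed even when the body raises. A sketch of the pattern using `contextlib.contextmanager` follows. `ManagedProvider` and its `events` list are hypothetical and exist only to make the call order observable; this is not how the library necessarily implements `connection()`.

```python
from contextlib import contextmanager
from typing import Any, Iterator, List


class ManagedProvider:
    """Hypothetical provider used only to illustrate the pattern."""

    def __init__(self) -> None:
        self.events: List[str] = []

    def connect(self, **config: Any) -> None:
        self.events.append(f"connect:{config.get('host')}")

    def disconnect(self) -> None:
        self.events.append("disconnect")

    @contextmanager
    def connection(self, **config: Any) -> Iterator["ManagedProvider"]:
        self.connect(**config)
        try:
            yield self  # the value bound by "as p"
        finally:
            # Runs even if the with-body raises.
            self.disconnect()


provider = ManagedProvider()
try:
    with provider.connection(host="localhost") as p:
        raise RuntimeError("query failed")
except RuntimeError:
    pass
```

After this runs, `provider.events` records a connect followed by a disconnect, despite the exception raised inside the `with` block.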

Error Handling

The module provides specific exception types:

from data_retrieval.model.exceptions import (
    DataProviderError,
    ConnectionError,
    QueryError,
    ValidationError
)

try:
    result = provider.fetch(filters={"invalid": "field"})
except ConnectionError as e:
    print(f"Connection failed: {e}")
except QueryError as e:
    print(f"Query failed: {e}")
except DataProviderError as e:
    print(f"General error: {e}")
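The order of the `except` clauses matters: specific exceptions must come before the general one. The sketch below demonstrates this with locally defined stand-in classes, assuming (this is not confirmed by the source) that the specific errors subclass `DataProviderError`. Note also that importing the library's `ConnectionError` by that name shadows Python's built-in `ConnectionError` in the importing module, so an alias such as `ProviderConnectionError` can avoid confusion.

```python
class DataProviderError(Exception):
    """Assumed base class for all provider errors."""


class ProviderConnectionError(DataProviderError):
    """Stand-in for the library's ConnectionError, renamed to avoid shadowing the builtin."""


class QueryError(DataProviderError):
    """Stand-in for a failed query."""


def classify(exc: Exception) -> str:
    try:
        raise exc
    except ProviderConnectionError:
        return "connection"
    except QueryError:
        return "query"
    except DataProviderError:
        # Catch-all for any other provider error; must come last.
        return "general"


labels = [
    classify(ProviderConnectionError("refused")),
    classify(QueryError("bad filter")),
    classify(DataProviderError("unknown")),
]
```

If the `DataProviderError` clause came first, it would swallow the two more specific cases and every label would read "general".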

Development

Setup Development Environment

# Clone repository
git clone https://github.com/AbigailWilliams1692/data-retrieval-module.git
cd data-retrieval-module

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install development dependencies
pip install -e ".[dev]"

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=data_retrieval --cov-report=html

# Run specific test file
pytest tests/test_data_provider.py

Code Quality

# Format code
black data_retrieval/ tests/

# Sort imports
isort data_retrieval/ tests/

# Type checking
mypy data_retrieval/

# Linting
flake8 data_retrieval/ tests/

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Changelog

See CHANGELOG.md for a list of changes and version history.


Made with ❤️ by AbigailWilliams1692

