# 🌌 Incorporator (v1.0.0)
**The Dynamic Class Building and Zero-Boilerplate Universal Data Gateway.**

[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![Pydantic V2](https://img.shields.io/badge/Pydantic-V2-e92063.svg)](https://docs.pydantic.dev/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Code style: Ruff](https://img.shields.io/badge/code%20style-Ruff-261230.svg)](https://github.com/astral-sh/ruff)
[![Typing: Strict](https://img.shields.io/badge/typing-strict-green.svg)](https://mypy.readthedocs.io/en/stable/)

Stop writing boilerplate models, manual HTTP connection loops, pagination state-trackers, and fragile data-cleaning lambda functions.

**Incorporator** is a Python framework that turns raw JSON, CSV, and XML data into fully typed, relational Python object graphs, often with a single `incorp()` call.

## 🚀 Installation

```bash
pip install incorporator
```

## ⚡ The "Zero-Boilerplate" Philosophy

**The Old Way:** Define a rigid `BaseModel`, write an `httpx` loop, handle 429 retries, write a custom paginator, manually link foreign keys, catch `KeyError`s, and hope the API schema doesn't change.

**The Incorporator Way:**

```python
from incorporator import Incorporator
from incorporator.methods.paginate import NextUrlPaginator

class Crypto(Incorporator): pass

# Top-level await: run inside a coroutine or the asyncio REPL (python -m asyncio).
# Fetch 150 coins, auto-paginate, generate Pydantic models on the fly, and rate-limit requests.
coins = await Crypto.incorp(
    inc_url="https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd",
    inc_code="id",
    inc_name="name",
    inc_page=NextUrlPaginator("next"),
    call_lim=3
)

print(coins[0].inc_name)       # e.g. "Bitcoin"
print(coins[0].current_price)  # e.g. 64000.0 (typed as float by the generated Pydantic V2 model)
```

## 🛠️ The Core Architectural Pillars

### 1. The Holy Trinity API & Dynamic Registries

* `incorp()`: Extracts raw data, compiles dynamic Pydantic schemas, and loads the results into `IncorporatorList` wrappers.
* `refresh()`: Hydrates existing instances with new data (useful for live feeds).
* `export()`: Dumps stateful object graphs back into sanitized JSON, XML, or CSV files.
* The `inc_dict`: Every object registers itself in a memory-safe `WeakValueDictionary`, so any object can be looked up by its code: `coins.inc_dict.get('bitcoin')`.
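
The registry idea can be illustrated with the standard library alone. The sketch below is not Incorporator's implementation; `RegisteredModel` and its `registry` attribute are hypothetical names, used only to show how a `weakref.WeakValueDictionary` allows lookup by code while still letting objects be garbage-collected once nothing else references them.

```python
import gc
import weakref


class RegisteredModel:
    """Hypothetical stand-in for a model class that registers its own instances."""

    # Maps code -> live instance. An entry disappears once its instance is collected.
    registry: "weakref.WeakValueDictionary[str, RegisteredModel]" = (
        weakref.WeakValueDictionary()
    )

    def __init__(self, code: str, name: str) -> None:
        self.code = code
        self.name = name
        RegisteredModel.registry[code] = self


btc = RegisteredModel("bitcoin", "Bitcoin")
found_while_alive = RegisteredModel.registry.get("bitcoin") is btc

del btc       # drop the only strong reference
gc.collect()  # make collection explicit for interpreters without refcounting
found_after_release = RegisteredModel.registry.get("bitcoin") is not None
```

Because the dictionary holds only weak references, the registry itself never keeps large result sets alive.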

### 2. Declarative ETL & Null-Safe Converters

Data is messy. Incorporator's built-in `conv_dict` tools intercept bad data before Pydantic validation, so malformed values are handled declaratively instead of crashing the load.

* `inc(type)`: Automatically ranks fallbacks. `inc(datetime)` parses ISO-8601 or 10+ standard string formats.
* `calc(func, *keys)`: Multi-column row calculations, for example `calc(len, 'residents', default=0)`.
* `link_to` & `link_to_list`: Zero-boilerplate graph relational mapping.
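
To make the "null-safe with ranked fallbacks" idea concrete, here is a minimal standard-library sketch. It does not reproduce Incorporator's `inc` or `calc`; `parse_datetime`, `calc_row`, and `FALLBACK_FORMATS` are hypothetical names, and the format list is an arbitrary illustration.

```python
from datetime import datetime
from typing import Any, Callable, Mapping, Optional, Sequence

# Tried in order after ISO-8601 parsing fails; earlier entries win on ambiguous input.
FALLBACK_FORMATS: Sequence[str] = ("%d/%m/%Y", "%Y%m%d", "%d-%m-%Y %H:%M")


def parse_datetime(value: Any, default: Optional[datetime] = None) -> Optional[datetime]:
    """Return a datetime for recognizable input, otherwise the default."""
    if isinstance(value, datetime):
        return value
    if not isinstance(value, str) or not value.strip():
        return default
    text = value.strip()
    try:
        return datetime.fromisoformat(text)
    except ValueError:
        pass
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return default


def calc_row(func: Callable[..., Any], row: Mapping[str, Any], *keys: str, default: Any = None) -> Any:
    """Apply func to the named columns; fall back to default on missing or bad data."""
    try:
        return func(*(row[key] for key in keys))
    except (KeyError, TypeError, ValueError):
        return default


iso_value = parse_datetime("2024-01-15T10:30:00")
slash_value = parse_datetime("15/01/2024")
bad_value = parse_datetime("not a date")
resident_count = calc_row(len, {"residents": ["a", "b"]}, "residents", default=0)
null_count = calc_row(len, {"residents": None}, "residents", default=0)
```

The point of the pattern is that every failure path ends in a typed default, so validation downstream never sees a malformed value.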

### 3. Native Concurrency & Invisible Resilience

Pass a list of 500 URLs or trigger a deep drill, and Incorporator spins up an `asyncio.Semaphore`, shares a single `httpx.AsyncClient` pool, and batches the requests. On a 429 Too Many Requests response, it retries with jitter via `tenacity`. If the request still fails, it skips that row, records it in `results.failed_sources`, and returns the remaining objects without crashing your pipeline.
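
The control flow described above can be sketched with `asyncio` alone. This is an illustration under stated assumptions, not the library's code: it swaps `httpx` for a fake fetch function and `tenacity` for a hand-written retry loop, and `RateLimited`, `fetch_with_retry`, `fetch_all`, and `fake_fetch` are all hypothetical names.

```python
import asyncio
import random
from typing import Awaitable, Callable, Dict, List, Tuple

Fetch = Callable[[str], Awaitable[dict]]


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response."""


async def fetch_with_retry(fetch: Fetch, url: str, attempts: int = 3, base_delay: float = 0.01) -> dict:
    """Retry on RateLimited with exponential backoff and full jitter."""
    for attempt in range(attempts):
        try:
            return await fetch(url)
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller record the failure
            await asyncio.sleep(random.uniform(0, base_delay * 2 ** attempt))
    raise RuntimeError("attempts must be at least 1")


async def fetch_all(fetch: Fetch, urls: List[str], limit: int = 5) -> Tuple[Dict[str, dict], List[str]]:
    """Fetch every URL with at most `limit` requests in flight; collect failures."""
    semaphore = asyncio.Semaphore(limit)
    results: Dict[str, dict] = {}
    failed: List[str] = []

    async def worker(url: str) -> None:
        async with semaphore:
            try:
                results[url] = await fetch_with_retry(fetch, url)
            except RateLimited:
                failed.append(url)  # skip this source instead of aborting the batch

    await asyncio.gather(*(worker(url) for url in urls))
    return results, failed


calls: Dict[str, int] = {}


async def fake_fetch(url: str) -> dict:
    """Simulated endpoint: 'flaky' fails once, 'down' always fails."""
    calls[url] = calls.get(url, 0) + 1
    if url == "down" or (url == "flaky" and calls[url] == 1):
        raise RateLimited(url)
    return {"url": url}


results, failed = asyncio.run(fetch_all(fake_fetch, ["ok", "flaky", "down"]))
```

Here `results` holds the two sources that eventually succeeded and `failed` lists the one that exhausted its retries, mirroring the "skip and report" behavior described for `failed_sources`.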

### 4. Advanced Asynchronous Pagination

Self-contained strategy classes handle pagination without infinite loops. Included: `NextUrlPaginator`, `CursorPaginator`, `OffsetPaginator`, `PageNumberPaginator`, and `LinkHeaderPaginator`.
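
As a rough picture of what a next-URL strategy has to guard against, the sketch below follows a `next` field until it is missing, repeats, or a page cap is reached. It is synchronous for brevity and is not the library's `NextUrlPaginator`; `iter_pages` and `FAKE_API` are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Set

Page = Dict[str, Any]


def iter_pages(
    fetch: Callable[[str], Page],
    start_url: str,
    next_key: str = "next",
    max_pages: int = 10,
) -> Iterator[Page]:
    """Yield pages by following `next_key`, stopping on cycles or at max_pages."""
    seen: Set[str] = set()
    url: Optional[str] = start_url
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch(url)
        yield page
        candidate = page.get(next_key)
        url = candidate if isinstance(candidate, str) else None


FAKE_API: Dict[str, Page] = {
    "/items?page=1": {"results": [1, 2], "next": "/items?page=2"},
    "/items?page=2": {"results": [3, 4], "next": "/items?page=3"},
    # A misbehaving API that points back to the first page.
    "/items?page=3": {"results": [5], "next": "/items?page=1"},
}

all_items = [x for page in iter_pages(FAKE_API.__getitem__, "/items?page=1") for x in page["results"]]
capped_items = [
    x
    for page in iter_pages(FAKE_API.__getitem__, "/items?page=1", max_pages=2)
    for x in page["results"]
]
```

Tracking visited URLs handles APIs that loop back on themselves, while the page cap bounds the run even when every URL is new.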


## 📖 Real-World Showcases

### Showcase 1: HATEOAS & Relational Mapping (Star Wars API)

Turn disconnected flat APIs into deeply nested, traversable object graphs using `link_to` and `link_to_list`.

```python
from incorporator import Incorporator
from incorporator.methods.converters import calc, flt, link_to, link_to_list

class Planet(Incorporator): pass
class Film(Incorporator): pass
class Person(Incorporator): pass

# Top-level await: run inside a coroutine or the asyncio REPL (python -m asyncio).
# 1. Build the foundational graph nodes.
# SWAPI planets expose "name" and films expose "title", so inc_name is set explicitly.
planets = await Planet.incorp(inc_url="https://swapi.dev/api/planets/", inc_code="url", inc_name="name")
films = await Film.incorp(inc_url="https://swapi.dev/api/films/", inc_code="url", inc_name="title")

# 2. Fetch people and map relations
people = await Person.incorp(
    inc_url="https://swapi.dev/api/people/",
    inc_code="url",
    conv_dict={
        # Safely cast string numbers to floats
        "height": calc(float, default=0.0, target_type=flt),

        # Link URL strings to the in-memory Planet and Film objects
        "homeworld": calc(link_to(planets), default=None),
        "films": calc(link_to_list(films), default=[])
    }
)

# Deep dot-notation navigation
luke = people[0]
print(luke.homeworld.inc_name) # "Tatooine"
print(luke.films[0].inc_name)  # "A New Hope"
```
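
Conceptually, the linking step swaps a URL string for the object already registered under that URL. The following sketch shows that idea with plain dictionaries; `make_linker`, `make_list_linker`, and `Node` are hypothetical names and do not describe how `link_to` or `link_to_list` are implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional, TypeVar

T = TypeVar("T")


def make_linker(registry: Dict[str, T]) -> Callable[[Optional[str]], Optional[T]]:
    """Return a function that swaps one key (e.g. a URL) for the registered object."""
    def resolve(key: Optional[str]) -> Optional[T]:
        return registry.get(key) if key is not None else None
    return resolve


def make_list_linker(registry: Dict[str, T]) -> Callable[[Optional[Iterable[str]]], List[T]]:
    """Return a function that resolves many keys, skipping unknown ones."""
    def resolve_all(keys: Optional[Iterable[str]]) -> List[T]:
        return [registry[key] for key in (keys or []) if key in registry]
    return resolve_all


@dataclass
class Node:
    url: str
    name: str


planet_index = {"/planets/1/": Node("/planets/1/", "Tatooine")}
film_index = {"/films/1/": Node("/films/1/", "A New Hope")}

raw_person = {
    "name": "Luke Skywalker",
    "homeworld": "/planets/1/",
    "films": ["/films/1/", "/films/99/"],  # the second URL is not in the index
}
homeworld = make_linker(planet_index)(raw_person["homeworld"])
linked_films = make_list_linker(film_index)(raw_person["films"])
```

Unknown keys resolve to `None` or are dropped, which keeps a partially loaded graph traversable.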

### Showcase 2: Parent-Based Enrichment (PokéAPI)

Pass shallow objects into `inc_parent` to trigger automatic concurrent bulk detail scraping.

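```python
from incorporator import Incorporator
from incorporator.methods.converters import calc
from incorporator.methods.paginate import NextUrlPaginator

class Nav(Incorporator): pass
class Pokemon(Incorporator): pass

# Top-level await: run inside a coroutine or the asyncio REPL (python -m asyncio).
# 1. SHALLOW DISCOVERY: Fetch 150 navigation URLs (3 pages of 50)
pokemon_nav = await Nav.incorp(
    inc_url="https://pokeapi.co/api/v2/pokemon/?limit=50&offset=0",
    inc_name="name", name_chg=[('url', 'detail_url')],
    inc_page=NextUrlPaginator("next"), call_lim=3
)

def calculate_bst(stats: list) -> int:
    """Sum base_stat across the stat entries, ignoring malformed items."""
    return sum(s.get("base_stat", 0) for s in stats if isinstance(s, dict))

# 2. DEEP ENRICHMENT: Pass the parent objects. The framework pulls out 'detail_url',
# fires 150 concurrent requests, and builds deep objects automatically.
enriched_pokemon = await Pokemon.incorp(
    inc_parent=pokemon_nav,
    inc_code="id", inc_name="name",
    conv_dict={
        # Dynamically calculate Base Stat Total from the nested JSON array
        "stats": calc(calculate_bst, "stats", default=0, target_type=int)
    }
)
```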

### Showcase 3: Local XML to Live JSON Bulk POST (NHTSA API)

Bridge deeply nested local XML data with live JSON REST APIs.

```python
from incorporator import Incorporator
from incorporator.methods.converters import inc  # assumed to live alongside calc

class JimmyInvoice(Incorporator): pass
class NHTSARecord(Incorporator): pass

# Top-level await: run inside a coroutine or the asyncio REPL (python -m asyncio).
# 1. Extract nested data from a local XML file
invoices = await JimmyInvoice.incorp(
    inc_file="shady_jimmy.xml",
    rec_path="Dealership.AuditFile.Invoices.Invoice"
)

vin_batch_string = ";".join([getattr(inv.Vehicle, "VIN", "") for inv in invoices])

# 2. Hit a live JSON bulk endpoint using a POST payload
live_records = await NHTSARecord.incorp(
    inc_url="https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/",
    method="POST",
    form_payload={"format": "json", "DATA": vin_batch_string},
    rec_path="Results",
    inc_code="VIN",
    conv_dict={ "ModelYear": inc(int) } # Force string years to integers
)

# 3. Audit via the memory-safe registry
for inv in invoices:
    vin = getattr(inv.Vehicle, "VIN", "")
    actual_car = live_records.inc_dict.get(vin)
    if actual_car is None:
        continue  # VIN missing from the NHTSA response; nothing to compare
    if actual_car.ModelYear != int(inv.Vehicle.Year):
        print("Fraud Detected!")
```
## 🕵️ Non-Blocking Observability

Need production logs without starving your async event loop?

```python
from incorporator import LoggedIncorporator

class WebAPI(LoggedIncorporator): pass

# Top-level await: run inside a coroutine or the asyncio REPL (python -m asyncio).
# Configures background multithreaded queue logging automatically
instance = await WebAPI.incorp(
    inc_url="https://api.example.com/data",
    enable_logging=True
)

instance.log_info("Standard trace")
instance.log_error("API Offline", exc_info=True)
instance.log_api("Web traffic trace") # Routes to isolated api.log
```
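
Queue-based logging of this kind is available in the standard library through `logging.handlers.QueueHandler` and `QueueListener`. The sketch below shows the general pattern only; whether `LoggedIncorporator` is built on these classes is an assumption, and `start_queue_logging` and `ListHandler` are hypothetical names.

```python
import logging
import logging.handlers
import queue
from typing import List, Tuple


class ListHandler(logging.Handler):
    """Collects messages in memory; stands in for a slow file or network handler."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def start_queue_logging(
    name: str, target: logging.Handler
) -> Tuple[logging.Logger, logging.handlers.QueueListener]:
    """Route a logger through a queue so callers never wait on the target handler."""
    log_queue: "queue.Queue[logging.LogRecord]" = queue.Queue(-1)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener = logging.handlers.QueueListener(log_queue, target)
    listener.start()  # a background thread drains the queue
    return logger, listener


sink = ListHandler()
logger, listener = start_queue_logging("demo.api", sink)
logger.info("Standard trace")
logger.error("API offline")
listener.stop()  # processes everything still queued, then joins the thread
```

The calling code only pays for a queue `put`, which is why this pattern suits an event loop that must not block on disk or network I/O.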

## 🤝 Contributing

1. Clone the repo.
2. `pip install -e ".[dev]"` (installs pytest, mypy, ruff).
3. Run tests: `pytest tests/ -v`.
4. Check typing: `mypy --strict incorporator`.

Built for data engineers who want to sleep at night.
