
The Dynamic Class Building and Zero-Boilerplate Universal Data Gateway.



🌌 Incorporator (v1.0.5)


Python 3.9+ | Pydantic V2 | Typing: Strict | Code style: Ruff | License: MIT

Stop writing boilerplate models, manual HTTP connection loops, pagination state-trackers, and fragile data-cleaning lambda functions.

Incorporator is a Python framework that transforms raw JSON, CSV, and XML data into fully typed, relational Python object graphs in a single line of code. It replaces pages of hand-written boilerplate with a prebuilt engine.

The framework handles dynamic Pydantic metaprogramming, graph relational mapping, asynchronous connection pooling, and declarative ETL in less than 30KB.

🚀 Installation

pip install incorporator

⚡ The "Zero-Boilerplate" Philosophy

The Old Way: Define a rigid BaseModel, write an httpx loop, handle 429 retries, write a custom paginator, manually link foreign keys, catch KeyErrors, and hope the API schema doesn't change.

The Incorporator Way:

import asyncio

from incorporator import Incorporator  # assumed top-level export, like LoggedIncorporator

class Crypto(Incorporator): pass

async def crypto_coins():
    coins = await Crypto.incorp(
        inc_url="https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd",
        inc_code="id",
        inc_name="name",
    )

    # Returns a dynamically compiled Pydantic list wrapper with an O(1) lookup registry
    bitcoin = coins.inc_dict['bitcoin']
    print(bitcoin.circulating_supply)       # e.g. 20021206.0
    print(bitcoin.current_price)            # e.g. 78321

asyncio.run(crypto_coins())

🛠️ The Core Architectural Pillars

1. The Holy Trinity API & Dynamic Registries

  • incorp(): Extracts raw data, compiles dynamic Pydantic schemas natively, and loads data into intelligent IncorporatorList wrappers.
  • refresh(): Hydrates existing instances seamlessly with new data (perfect for live feeds).
  • export(): Dumps stateful object graphs back into sanitized JSON, XML, or CSV files.
  • The inc_dict: Every object automatically registers itself into a memory-safe WeakValueDictionary. Look up any object instantly: coins.inc_dict.get('bitcoin').
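The self-registering, weakly referenced registry described above can be illustrated with the standard library alone. This is a minimal sketch, not Incorporator's implementation: the `Coin` class, its `registry` attribute, and the sample prices are hypothetical stand-ins.

```python
import gc
import weakref


class Coin:
    """Hypothetical record type standing in for a dynamically built model."""

    # Values are held weakly, so the registry alone never keeps a record alive.
    registry = weakref.WeakValueDictionary()

    def __init__(self, code: str, price: float) -> None:
        self.code = code
        self.price = price
        Coin.registry[code] = self  # self-register on construction


# Sample values for illustration only.
coins = [Coin("bitcoin", 78321.0), Coin("ethereum", 2300.0)]

# Dictionary lookup by code; no list scan is needed.
print(Coin.registry.get("bitcoin").price)

# Once the last strong reference goes away, the registry entry disappears too.
coins.clear()
gc.collect()
print(Coin.registry.get("bitcoin"))
```

The trade-off of weak values is that something else (here the `coins` list) must hold the objects for as long as lookups are expected to succeed.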

2. Declarative ETL & Null-Safe Converters

Data is messy. Incorporator's built-in conv_dict tools intercept bad data before Pydantic validation, shielding you from crashes with beautiful, readable syntax.

  • inc(type): Automatically ranks fallbacks. inc(float) safely converts API garbage like "unknown" or "n/a" into 0.0.
  • calc(func, *keys): Multi-column row calculations, e.g. calc(len, 'residents', default=0).
  • link_to & link_to_list: Zero-boilerplate Graph Relational Mapping.
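To make the idea of a null-safe converter concrete, here is a small standard-library sketch of the general pattern: wrap a cast so that unparseable input falls back to a default, then apply per-key converters to each row before validation. The names `null_safe` and `clean_row` are hypothetical and are not part of Incorporator; the real inc() and conv_dict may work differently.

```python
from typing import Any, Callable, Mapping


def null_safe(cast: Callable[[Any], Any], default: Any) -> Callable[[Any], Any]:
    """Return a converter that falls back to `default` when `cast` fails."""

    def convert(value: Any) -> Any:
        try:
            return cast(value)
        except (TypeError, ValueError):
            return default

    return convert


def clean_row(
    row: Mapping[str, Any],
    converters: Mapping[str, Callable[[Any], Any]],
) -> dict:
    """Apply per-key converters to one raw record; other keys pass through."""
    return {
        key: converters.get(key, lambda v: v)(value)
        for key, value in row.items()
    }


# Illustrative record, not taken from a real API response.
raw = {"name": "example", "population": "unknown", "diameter": "10465"}
converters = {
    "population": null_safe(float, 0.0),
    "diameter": null_safe(float, 0.0),
}
print(clean_row(raw, converters))
```

Running the converters before validation means a strict schema only ever sees values of the expected type.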

3. Native Concurrency & The State Carrier

Pass parent objects into inc_parent and declare an inc_child path. Incorporator caches the path state, drills into the nested objects, automatically spins up an asyncio.Semaphore, and batches concurrent deep-drills across a single shared httpx pool.
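The semaphore-bounded batching described above follows a common asyncio pattern, sketched here with the standard library only. The `fetch_all` helper, the placeholder URLs, and the simulated request are assumptions for illustration; a real version would issue HTTP calls on one shared client instead of sleeping.

```python
import asyncio
from typing import List


async def fetch_all(urls: List[str], limit: int = 5) -> List[str]:
    """Process every URL concurrently, with at most `limit` in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def fetch_one(url: str) -> str:
        async with semaphore:
            # Placeholder for a real HTTP request on a shared client.
            await asyncio.sleep(0.01)
            return f"payload for {url}"

    # gather() returns results in the same order as the inputs.
    return await asyncio.gather(*(fetch_one(u) for u in urls))


urls = [f"https://example.invalid/item/{i}" for i in range(20)]
results = asyncio.run(fetch_all(urls, limit=5))
print(len(results))
```

All tasks are created up front, but the semaphore ensures only `limit` of them are inside the request section at any moment.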

4. Advanced Asynchronous Pagination

Pagination is handled by self-contained strategy classes that guard against infinite loops: NextUrlPaginator, CursorPaginator, OffsetPaginator, and PageNumberPaginator. POST-body cursor overrides are supported natively.
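As a rough illustration of next-URL pagination with a loop guard, the sketch below follows a "next" link until it runs out or repeats. It is a generic example, not the NextUrlPaginator implementation: `follow_next_links`, its parameters, and the in-memory `FAKE_API` are all hypothetical.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Page = Dict[str, Any]


def follow_next_links(
    first_url: str,
    fetch: Callable[[str], Page],
    next_key: str = "next",
    records_key: str = "results",
    max_pages: Optional[int] = None,
) -> Iterator[Any]:
    """Yield records page by page, following the URL stored under `next_key`."""
    seen = set()
    url: Optional[str] = first_url
    pages = 0
    while url and url not in seen:
        if max_pages is not None and pages >= max_pages:
            break
        seen.add(url)  # guards against an API that links back to itself
        page = fetch(url)
        pages += 1
        yield from page.get(records_key, [])
        url = page.get(next_key)


# In-memory stand-in for an HTTP API; page 2 wrongly points back to page 1.
FAKE_API: Dict[str, Page] = {
    "/items?page=1": {"results": [1, 2], "next": "/items?page=2"},
    "/items?page=2": {"results": [3], "next": "/items?page=1"},
}

print(list(follow_next_links("/items?page=1", FAKE_API.__getitem__)))
```

Tracking visited URLs and an optional page cap are two independent stop conditions; either one is enough to prevent an endless crawl.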


📖 Real-World Showcases

Showcase 1: Graph Relational Mapping (Star Wars API)

Turn disconnected flat APIs into deeply nested, traversable object graphs using split_and_get and link_to.

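import asyncio

from incorporator import Incorporator  # assumed top-level export, like LoggedIncorporator
from incorporator.methods.converters import split_and_get, link_to, link_to_list
# NextUrlPaginator must be imported too; its module path is not shown in this README.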

class Planet(Incorporator): pass
class Film(Incorporator): pass
class Person(Incorporator): pass

async def far_far_away():
    BASE_URL = "https://swapi.dev/api"
    
    # 0. Build a reusable, highly efficient ID extractor
    get_id = split_and_get('/', -1, int)
    
    # 1. Build the foundational Graph Nodes
    planets = await Planet.incorp(
        inc_url=f"{BASE_URL}/planets/", rec_path="results",
        inc_code="id", inc_name="name", inc_page=NextUrlPaginator("next"),
        conv_dict={"url": get_id}, name_chg=[("url", "id")]
    )

    films = await Film.incorp(
        inc_url=f"{BASE_URL}/films/", rec_path="results",
        inc_code="id", inc_name="title", inc_page=NextUrlPaginator("next"),
        conv_dict={"url": get_id}, name_chg=[("url", "id")]
    )

    # 2. Fetch People and map relations natively
    people = await Person.incorp(
        inc_url=f"{BASE_URL}/people/", rec_path="results",
        inc_code="id", inc_name="name", inc_page=NextUrlPaginator("next"),
        conv_dict={
            "url": get_id,
            "homeworld": link_to(planets, extractor=get_id),
            "films": link_to_list(films, extractor=get_id)
        },
        name_chg=[("url", "id")]
    )

    # Find Boba Fett at O(1) speed with graph mapping already built natively!
    boba_fett = people.inc_dict.get(22)
    print(boba_fett.homeworld.inc_name)  # "Kamino"
    print(boba_fett.films[0].inc_name)   # "The Empire Strikes Back"

asyncio.run(far_far_away())
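The showcase relies on two ideas: pulling a numeric ID out of a resource URL, and swapping a foreign-key URL for the object it points to. The sketch below shows both with plain dictionaries. The helpers `id_from_url` and `resolve` are hypothetical and are not the library's split_and_get or link_to; in particular, whether split_and_get strips a trailing slash is not stated in this README, so the sketch handles it explicitly.

```python
from typing import Any, Callable, Dict, Optional


def id_from_url(url: str) -> int:
    """Return the trailing numeric segment of a resource URL."""
    # rstrip removes a trailing slash that would otherwise leave an empty segment.
    return int(url.rstrip("/").split("/")[-1])


def resolve(
    registry: Dict[int, Any],
    extractor: Callable[[str], int],
) -> Callable[[Optional[str]], Any]:
    """Build a converter that swaps a foreign-key URL for the registered object."""

    def convert(url: Optional[str]) -> Any:
        if not url:
            return None
        return registry.get(extractor(url))

    return convert


# A one-entry registry keyed by ID, standing in for an already loaded node set.
planets = {1: {"id": 1, "name": "Tatooine"}}
to_planet = resolve(planets, id_from_url)

person = {"name": "Luke Skywalker", "homeworld": "https://swapi.dev/api/planets/1/"}
person["homeworld"] = to_planet(person["homeworld"])
print(person["homeworld"]["name"])
```

Because the parent registry is loaded first, each foreign key resolves with a single dictionary lookup rather than another request.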

Showcase 2: Explicit Parent-Based Enrichment (PokéAPI)

Pass shallow objects into inc_parent and explicitly declare inc_child to trigger automatic concurrent bulk scraping. No for loops required.

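import asyncio

from incorporator import Incorporator  # assumed top-level export, like LoggedIncorporator
# NextUrlPaginator must be imported too; its module path is not shown in this README.

class Nav(Incorporator): pass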
class Pokemon(Incorporator): pass

async def inc_pokedex():
    BASE_URL = "https://pokeapi.co/api/v2"

    # 1. SHALLOW DISCOVERY: Fetch 150 navigation objects.
    # We explicitly tell the framework that each object's detail URL lives in the "url" key.
    pokemon_nav = await Nav.incorp(
        inc_url=f"{BASE_URL}/pokemon/?limit=50&offset=0",
        rec_path="results",
        inc_name="name",
        inc_child="url",  # <--- The State Carrier saves this path!
        inc_page=NextUrlPaginator("next"),
        call_lim=3
    )

    # 2. DEEP ENRICHMENT: The framework reads the cached state from Phase 1,
    # drills into the 150 objects, extracts the URLs, and fires 150 concurrent requests!
    enriched_pokemon = await Pokemon.incorp(
        inc_parent=pokemon_nav,
        inc_code="id",
        inc_name="name",
        excl_lst=["sprites", "moves", "game_indices", "held_items"]
    )

    # Deep objects are fully built.
    for pokemon in enriched_pokemon[:3]:
        print(pokemon.inc_name, pokemon.abilities[0].ability.name)

asyncio.run(inc_pokedex())

Showcase 3: XML Parsing to Live Bulk POSTs (NHTSA API)

Seamlessly bridge deep local XML data with live JSON REST APIs using Declarative POST tokens.

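import asyncio

from incorporator import Incorporator  # assumed top-level export, like LoggedIncorporator
from incorporator.methods.converters import join_all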

class JimmyInvoice(Incorporator): pass
class NHTSARecord(Incorporator): pass

async def audit_jimmys():
    # 1. Extract nested data from a local XML file into the Pydantic engine.
    # We set `inc_child` to cache the dot-notation path to the VINs.
    invoices = await JimmyInvoice.incorp(
        inc_file="shady_jimmy.xml",
        rec_path="Dealership.AuditFile.Invoices.Invoice",
        inc_child="Vehicle.VIN" 
    )

    # 2. Hit a live JSON Bulk Endpoint using a Declarative POST payload.
    # The `join_all` token automatically extracts the VINs and joins them 
    # into a single, highly-optimized Batch POST Request!
    live_records = await NHTSARecord.incorp(
        inc_url="https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/",
        inc_parent=invoices,
        method="POST",
        payload_type="form",
        form_payload={
            "format": "json", 
            "data": join_all(";") # <--- Zero-boilerplate batching!
        },
        rec_path="Results",
        inc_code="VIN"
    )

    # 3. Audit instantly via the memory-safe registry
    for inv in invoices:
        vin = inv.Vehicle.VIN
        actual_car = live_records.inc_dict.get(vin)
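        # Normalize both sides to text so the check works whether a year arrives as a string or a number.
        if actual_car and str(actual_car.ModelYear).strip() != str(inv.Vehicle.Year).strip():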
            print("Fraud Detected!", inv.inc_code, inv.Vehicle.Model)

asyncio.run(audit_jimmys())
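Two mechanics in this showcase can be illustrated without the library: drilling into nested records with a dot-notation path, and joining the extracted values into one batch field. The sketch below assumes plain nested dictionaries; `drill` and `join_values` are hypothetical names, not Incorporator's inc_child or join_all, and the VIN strings are dummy placeholders.

```python
from typing import Any, Iterable, List


def drill(record: Any, path: str) -> Any:
    """Follow a dot-notation path through nested dicts; return None on a miss."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def join_values(records: Iterable[Any], path: str, sep: str) -> str:
    """Collect the value at `path` from each record and join the hits."""
    values: List[str] = []
    for record in records:
        value = drill(record, path)
        if value is not None:
            values.append(str(value))
    return sep.join(values)


# Placeholder records; the VIN strings are dummies, not real vehicles.
invoices = [
    {"Vehicle": {"VIN": "VIN-AAA", "Year": "2015"}},
    {"Vehicle": {"VIN": "VIN-BBB", "Year": "2018"}},
    {"Vehicle": {}},  # a record without a VIN is skipped
]

form_payload = {"format": "json", "data": join_values(invoices, "Vehicle.VIN", ";")}
print(form_payload["data"])
```

Sending one joined field keeps the whole audit to a single POST instead of one request per invoice.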

🕵️ Non-Blocking Observability

Need production logs without starving your async event loop?

import asyncio

from incorporator import LoggedIncorporator

class WebAPI(LoggedIncorporator): pass

async def main():
    # Configures background multithreaded queue logging automatically
    instance = await WebAPI.incorp(
        inc_url="https://api.example.com/data",
        enable_logging=True
    )

    instance.log_info("Standard trace")
    instance.log_error("API Offline", exc_info=True)
    instance.log_api("Web traffic trace") # Routes to isolated api.log

asyncio.run(main())
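Queue-based background logging of this kind is available in the standard library through QueueHandler and QueueListener. The sketch below shows that general pattern and is not a description of how LoggedIncorporator is built. The `ListHandler` sink and the logger name are hypothetical; a production setup would more likely attach a FileHandler to the listener.

```python
import logging
import logging.handlers
import queue


class ListHandler(logging.Handler):
    """Collects formatted messages in memory for demonstration purposes."""

    def __init__(self) -> None:
        super().__init__()
        self.messages = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(self.format(record))


log_queue: queue.Queue = queue.Queue(-1)  # unbounded queue

sink = ListHandler()
sink.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# The listener drains the queue on its own thread and feeds the slow sink.
listener = logging.handlers.QueueListener(log_queue, sink)
listener.start()

logger = logging.getLogger("incorporator_demo")
logger.setLevel(logging.INFO)
logger.propagate = False
# Callers only pay for a queue put, so async code is not held up by I/O.
logger.addHandler(logging.handlers.QueueHandler(log_queue))

logger.info("Standard trace")
logger.error("API Offline")

listener.stop()  # processes everything still queued, then joins the thread
print(sink.messages)
```

Calling `listener.stop()` at shutdown matters: it flushes queued records before the worker thread exits.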

🤝 Contributing

  1. Let's go!

Built for data engineers who want to sleep at night.
