Tabular-Enhancement-Tool

WARNING: this project is still in its early stages, and the code is written primarily by an AI coding agent. Please use with caution.

A Python package for asynchronously enhancing tabular files (CSV, Excel, TSV, TXT, Parquet) by calling external APIs for each row.

Why

In modern data lake architectures, raw tabular data (e.g., event logs, daily exports, customer records) often arrives in formats like CSV, Excel, or TSV. To make this data actionable, it frequently needs to be enriched with information residing in other systems—such as CRM details, geolocation data, or legacy internal services—accessible only via REST APIs.

The Tabular Enhancement Tool (tet) is designed to streamline this enrichment process:

  • Multi-source enhancement: Fetches data from external JSON-based REST APIs.
  • High Performance via Multi-threading: Instead of sequential processing, which can take hours for large files, this tool utilizes a thread pool to handle hundreds of rows concurrently.
  • Data Integrity and Precision: The tool instructs Pandas to treat all inputs as strings, ensuring that original data—like ZIP codes with leading zeros or numeric IDs—is retained exactly as it appeared in the source.
  • Append-Only Enhancement: Your original columns are never modified. The responses are appended as new columns, allowing you to preserve the lineage of the raw data while adding new value.
  • Response Flattening: By default, the tool expands API response objects into individual columns, making the data immediately available for analysis. For REST APIs, the tool automatically extracts the data field from the JSON response if present, focusing on the core payload. This behavior can be disabled if a single nested object is preferred.
  • Strict Order Preservation: Even with parallel execution, the output rows are guaranteed to match the order of the input file, making it safe for downstream processes that rely on stable indexing.
  • Flexible field mapping: Map DataFrame columns to API payload fields. Supports nested dictionaries and lists for complex JSON payloads.
  • HTTP GET and POST support: Choose the appropriate method for your API, with support for URL templating and query parameters.
  • REST API Authentication: Supports Basic Auth, Bearer Token, and API Key authentication schemes.
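The features above can be illustrated with a small, self-contained sketch. It is not the tool's actual implementation: `fake_lookup`, `flatten`, `enhance`, the `api_` column prefix, and the dotted column names are all hypothetical, and the standard-library `csv` module stands in for Pandas (where the equivalent of "treat all inputs as strings" would presumably be `read_csv(..., dtype=str)`). The sketch shows string-preserving input, a thread pool whose results come back in input order, extraction of the `data` field, flattening, and append-only output.

```python
import csv
import io
import time
from concurrent.futures import ThreadPoolExecutor

SAMPLE = "id,zip\n001,02134\n002,10001\n"


def fake_lookup(row):
    """Hypothetical stand-in for a JSON REST call made once per row."""
    # Make the first row slower so it finishes last, to exercise ordering.
    time.sleep(0.05 if row["id"] == "001" else 0)
    city = "Boston" if row["zip"] == "02134" else "Unknown"
    return {"status": "ok", "data": {"city": city, "geo": {"lat": "42.35"}}}


def flatten(obj, prefix=""):
    """Expand nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def enhance(rows, call_api, max_workers=8):
    """Call the API for every row concurrently and append the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order,
        # regardless of which call finishes first.
        responses = list(pool.map(call_api, rows))

    enhanced = []
    for row, response in zip(rows, responses):
        # Use the "data" field when present, else the whole response.
        payload = response.get("data", response)
        if not isinstance(payload, dict):
            payload = {"value": payload}
        new_row = dict(row)  # copy; original values are not modified
        for key, value in flatten(payload).items():
            new_row[f"api_{key}"] = value  # hypothetical prefix
        enhanced.append(new_row)
    return enhanced


# csv.DictReader yields every value as a string, so "02134" keeps its zero.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
result = enhance(rows, fake_lookup)
for line in result:
    print(line)
```

In a real run, `fake_lookup` would be replaced by an HTTP GET or POST request carrying whatever authentication the API requires.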

Installation

You can install the package from PyPI:

pip install tabular-enhancement-tool

This will automatically install the required dependencies (pandas, requests, openpyxl) and provide the tet command.
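To confirm that the installation worked, a short check along the following lines may help. It is only a sketch: `installation_report` is a hypothetical helper, not part of the package. It relies solely on the standard library to look up the installed distribution version and to see whether the `tet` command is on the `PATH`.

```python
import shutil
from importlib import metadata


def installation_report(dist_name="tabular-enhancement-tool", command="tet"):
    """Return the installed version and command path, or None for each if missing."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return {"version": version, "command_path": shutil.which(command)}


print(installation_report())
```

If either value is `None`, the package is probably not installed in the active environment.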

Usage

For usage instructions, see the documentation on Read the Docs.

License

Distributed under the MIT License. See LICENSE for more information.

Development and CI/CD

  • Linting & Formatting: Ruff is used to maintain high code quality and consistent style.
  • Documentation: Managed by Sphinx and hosted on Read the Docs.
  • Testing: Tests run with pytest, and coverage is tracked with Codecov.

Credits

This tool was authored by Christopher Boyd and co-authored/developed by Junie, an autonomous programmer developed by JetBrains.
