
Upgrade your Pandas ETL process.


Abraxos


Abraxos is a lightweight Python toolkit for robust, row-aware data processing using Pandas and Pydantic. It helps you:

  • Read and clean messy CSVs
  • Transform data with fault-tolerant functions
  • Validate rows using Pydantic models
  • Load data into SQL databases with graceful error recovery

🚀 Features

  • 📄 CSV Ingestion with Bad Line Recovery
    Read CSVs in full or in chunks, and recover malformed lines separately.

  • 🔁 Transform DataFrames Resiliently
    Apply transformation functions and isolate rows that fail.

  • 🧪 Pydantic-Based Row Validation
    Validate each row using a Pydantic model, separating valid and invalid records.

  • 🛢️ SQL Insertion with Error Splitting
    Insert DataFrames into SQL databases with automatic retry and chunking logic.


📦 Installation

pip install abraxos

Abraxos requires Python 3.8+ and depends on:

  • pandas
  • numpy
  • optionally sqlalchemy for SQL I/O
  • your own pydantic models for validation


Documentation

Full documentation is available at:
https://abraxos.readthedocs.io


🧭 Usage Examples

🔍 Read CSVs with Error Recovery

from abraxos import read_csv

bad_lines, df = read_csv("data.csv")
print("Bad lines:", bad_lines)
print("Clean data:", df.head())
Example Output
Bad lines: [['', 'oops', 'bad', 'row']]
Clean data:
   id    name  age
0   1     Joe   28
1   2   Alice   35
2   3  Marcus   40
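
For readers curious how bad-line recovery can work, here is a minimal sketch using plain pandas rather than Abraxos. It assumes pandas 1.4 or newer, where `pandas.read_csv` accepts a callable for `on_bad_lines` when `engine="python"`. The helper name `read_csv_collecting_bad_lines` is hypothetical, and this is not necessarily how Abraxos implements `read_csv`.

```python
import io

import pandas as pd


def read_csv_collecting_bad_lines(source, **kwargs):
    """Hypothetical helper: return (bad_lines, df), like the example above."""
    bad_lines = []

    def remember(bad_line):
        # pandas passes the offending line as a list of strings.
        bad_lines.append(bad_line)
        return None  # returning None tells pandas to skip the line

    df = pd.read_csv(source, engine="python", on_bad_lines=remember, **kwargs)
    return bad_lines, df


# The second data line has four fields, but the header declares three.
raw = "id,name,age\n1,Joe,28\n,oops,bad,row\n2,Alice,35\n3,Marcus,40\n"
bad_lines, df = read_csv_collecting_bad_lines(io.StringIO(raw))
print("Bad lines:", bad_lines)
print(df)
```

In pandas, the callable is only invoked for lines with too many fields, so other kinds of malformed input would need separate handling.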

🧼 Transform DataFrames with Fault Isolation

from abraxos import transform

def clean_data(df):
    df["name"] = df["name"].str.strip().str.lower()
    return df

result = transform(df, clean_data)
print("Errors:", result.errors)
print("Success:", result.success_df)
Example Output
Errors: []
Success:
   id    name  age
0   1     joe   28
1   2   alice   35
2   3  marcus   40
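
One way to isolate failing rows is to retry a failed transformation on successively smaller halves of the DataFrame until single bad rows remain. The sketch below illustrates that idea in plain pandas. The function name `transform_isolating_failures` and the returned tuple layout are assumptions made for illustration; Abraxos's own `transform` may work differently.

```python
import pandas as pd


def transform_isolating_failures(df, func):
    """Hypothetical helper: return (errors, errored_df, success_df)."""
    errors, failed_parts, ok_parts = [], [], []

    def visit(part):
        if part.empty:
            return
        try:
            # Work on a copy so a half-applied transform never leaks out.
            ok_parts.append(func(part.copy()))
        except Exception as exc:
            if len(part) == 1:
                # A single row that still fails is the culprit.
                errors.append(exc)
                failed_parts.append(part)
            else:
                # Otherwise split in half and try each half separately.
                mid = len(part) // 2
                visit(part.iloc[:mid])
                visit(part.iloc[mid:])

    visit(df)
    errored_df = pd.concat(failed_parts) if failed_parts else df.iloc[0:0]
    success_df = pd.concat(ok_parts) if ok_parts else df.iloc[0:0]
    return errors, errored_df, success_df


def clean_data(df):
    df["name"] = df["name"].str.strip().str.lower()
    df["age"] = df["age"].astype(int)  # raises ValueError on "forty"
    return df


people = pd.DataFrame(
    {
        "name": [" Joe ", "Alice", "Marcus", "Dana"],
        "age": ["28", "35", "forty", "40"],
    }
)
errors, errored_df, success_df = transform_isolating_failures(people, clean_data)
print("Errors:", errors)
print("Failed rows:", errored_df)
print("Success:", success_df)
```

Bisection keeps the common case cheap: a clean DataFrame is transformed in one call, and extra calls are only spent around the rows that fail.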

✅ Validate Rows Using Pydantic

from abraxos import validate
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

result = validate(df, Person())
print("Valid rows:", result.success_df)
print("Validation errors:", result.errors)
Example Output
Valid rows:
   name  age
0   Joe   28

Validation errors:
[
  ValidationError: 1 validation error for Person
  age
    value is not a valid integer (type=type_error.integer),

  ValidationError: 1 validation error for Person
  name
    none is not an allowed value (type=type_error.none.not_allowed)
]
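
To make the valid/invalid split concrete, here is a small sketch that validates each row directly with Pydantic, without Abraxos. It assumes `pydantic` is installed. The helper name `split_valid_rows` is hypothetical, and it takes the model class itself, which may differ from the signature of Abraxos's `validate`.

```python
import pandas as pd
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


def split_valid_rows(df, model_cls):
    """Hypothetical helper: return (valid_df, invalid_df, errors)."""
    valid_labels, invalid_labels, errors = [], [], []
    for label, record in zip(df.index, df.to_dict(orient="records")):
        try:
            model_cls(**record)  # raises ValidationError on a bad row
        except ValidationError as exc:
            invalid_labels.append(label)
            errors.append(exc)
        else:
            valid_labels.append(label)
    return df.loc[valid_labels], df.loc[invalid_labels], errors


rows = pd.DataFrame(
    {
        "name": pd.Series(["Joe", "Alice", None], dtype="object"),
        "age": pd.Series([28, "not a number", 40], dtype="object"),
    }
)
valid_df, invalid_df, validation_errors = split_valid_rows(rows, Person)
print("Valid rows:", valid_df)
print("Validation errors:", validation_errors)
```

Validating row by row is slower than vectorized checks, but it yields one error object per bad record, which is what makes per-row reporting possible.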

🗃️ Insert Into SQL With Retry Logic

from abraxos import to_sql
from sqlalchemy import create_engine

engine = create_engine("sqlite:///example.db")
result = to_sql(df, "people", engine)

print("Successful inserts:", result.success_df.shape[0])
print("Failed rows:", result.errored_df)
Example Output
Successful inserts: 2
Failed rows:
   name  age
2  None   40
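
The "retry and chunking" idea can be sketched with the standard-library `sqlite3` module: try to insert the whole batch in one transaction and, if the database rejects it, split the batch and retry each half. The function name `insert_splitting_on_error`, the table schema, and the return layout below are assumptions for illustration only. Abraxos's `to_sql` goes through SQLAlchemy and may behave differently.

```python
import sqlite3

import pandas as pd


def insert_splitting_on_error(df, table, conn):
    """Hypothetical helper: return (errors, errored_df, success_df)."""
    columns = ", ".join(f'"{c}"' for c in df.columns)
    placeholders = ", ".join("?" for _ in df.columns)
    sql = f'INSERT INTO "{table}" ({columns}) VALUES ({placeholders})'
    errors, failed_parts, ok_parts = [], [], []

    def visit(part):
        if part.empty:
            return
        rows = list(part.itertuples(index=False, name=None))
        try:
            with conn:  # commits on success, rolls back on error
                conn.executemany(sql, rows)
            ok_parts.append(part)
        except sqlite3.Error as exc:
            if len(part) == 1:
                errors.append(exc)
                failed_parts.append(part)
            else:
                mid = len(part) // 2
                visit(part.iloc[:mid])
                visit(part.iloc[mid:])

    visit(df)
    errored_df = pd.concat(failed_parts) if failed_parts else df.iloc[0:0]
    success_df = pd.concat(ok_parts) if ok_parts else df.iloc[0:0]
    return errors, errored_df, success_df


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL, age INTEGER)")

people = pd.DataFrame(
    {
        "name": pd.Series(["Joe", "Alice", None], dtype="object"),
        "age": [28, 35, 40],
    }
)
errors, errored_df, success_df = insert_splitting_on_error(people, "people", conn)
print("Successful inserts:", success_df.shape[0])
print("Failed rows:", errored_df)
```

Because each attempt runs in its own transaction, a rejected batch leaves no partial rows behind before it is split and retried.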

🧪 Test Coverage

Abraxos's internal structure is modular and testable. You can run the tests with:

pytest tests/

📄 License

MIT License © 2024 Odos Matthews


🧙‍♂️ Author

Crafted by Odos Matthews to bring some magic to data workflows.

