
Upgrade your Pandas ETL process.


Abraxos

Abraxos is a lightweight Python toolkit for robust, row-aware data processing using Pandas and Pydantic. It helps you:

  • Read and clean messy CSVs
  • Transform data with fault-tolerant functions
  • Validate rows using Pydantic models
  • Load data into SQL databases with graceful error recovery

🚀 Features

  • 📄 CSV Ingestion with Bad Line Recovery
    Read CSVs in full or in chunks, and recover malformed lines separately.

  • 🔁 Transform DataFrames Resiliently
    Apply transformation functions and isolate rows that fail.

  • 🧪 Pydantic-Based Row Validation
    Validate each row using a Pydantic model, separating valid and invalid records.

  • 🛢️ SQL Insertion with Error Splitting
    Insert DataFrames into SQL databases with automatic retry and chunking logic.


📦 Installation

pip install abraxos

Abraxos requires Python 3.8+ and depends on:

  • pandas
  • numpy
  • optionally sqlalchemy for SQL I/O
  • your own pydantic models for validation


🧭 Usage Examples

🔍 Read CSVs with Error Recovery

from abraxos import read_csv

# Malformed lines are returned separately from the parsed DataFrame.
bad_lines, df = read_csv("data.csv")
print("Bad lines:", bad_lines)
print("Clean data:", df.head(), sep="\n")

Example Output

Bad lines: [['', 'oops', 'bad', 'row']]
Clean data:
   id    name  age
0   1     Joe   28
1   2   Alice   35
2   3  Marcus   40
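
To show how malformed lines can be captured instead of dropped, here is a minimal sketch using plain pandas. It is not Abraxos's implementation: the helper name `read_csv_collecting_bad_lines` is hypothetical, and it assumes pandas 1.4 or newer, where `on_bad_lines` accepts a callable when `engine="python"` is used.

```python
import io

import pandas as pd


def read_csv_collecting_bad_lines(source, **kwargs):
    """Hypothetical helper: parse a CSV and return (bad_lines, dataframe)."""
    bad_lines = []

    def keep_bad_line(fields):
        # pandas passes the already-split fields of a line with too many columns.
        bad_lines.append(fields)
        return None  # returning None leaves the line out of the DataFrame

    df = pd.read_csv(source, engine="python", on_bad_lines=keep_bad_line, **kwargs)
    return bad_lines, df


# The second data row has one field too many, so it is treated as a bad line.
raw = "id,name,age\n1,Joe,28\n2,Alice,35,extra\n3,Marcus,40\n"
bad_lines, df = read_csv_collecting_bad_lines(io.StringIO(raw))
print("Bad lines:", bad_lines)
print(df)
```

Note that pandas only treats lines with too many fields as bad; lines with too few fields are padded with missing values rather than passed to the callable.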

🧼 Transform DataFrames with Fault Isolation

from abraxos import transform

def clean_data(df):
    # Work on a copy so the caller's DataFrame is left untouched.
    df = df.copy()
    df["name"] = df["name"].str.strip().str.lower()
    return df

result = transform(df, clean_data)
print("Errors:", result.errors)
print("Success:", result.success_df, sep="\n")

Example Output

Errors: []
Success:
   id    name  age
0   1     joe   28
1   2   alice   35
2   3  marcus   40
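
One way to isolate failing rows is to try the function on the whole frame and, when it raises, split the frame in half and retry each half until single bad rows remain. The sketch below illustrates that idea with pandas only. It is an assumption about the general technique, not Abraxos's code, and `transform_isolating_failures` and `parse_age` are hypothetical names.

```python
import pandas as pd


def transform_isolating_failures(df, func):
    """Hypothetical helper: return (errors, errored_df, success_df)."""
    errors, bad_parts, ok_parts = [], [], []
    pending = [df]
    while pending:
        part = pending.pop()
        if part.empty:
            continue
        try:
            ok_parts.append(func(part.copy()))
        except Exception as exc:
            if len(part) == 1:
                # A single row that still fails is recorded as an error.
                errors.append(exc)
                bad_parts.append(part)
            else:
                # Split in half; push the first half last so it is processed first.
                mid = len(part) // 2
                pending.extend([part.iloc[mid:], part.iloc[:mid]])
    success_df = pd.concat(ok_parts) if ok_parts else df.iloc[0:0]
    errored_df = pd.concat(bad_parts) if bad_parts else df.iloc[0:0]
    return errors, errored_df, success_df


def parse_age(frame):
    # Raises ValueError if any value in the chunk cannot be parsed as an integer.
    frame["age"] = frame["age"].astype(int)
    return frame


people = pd.DataFrame({"name": ["Joe", "Alice", "Marcus"], "age": ["28", "n/a", "40"]})
errors, errored_df, success_df = transform_isolating_failures(people, parse_age)
print("Errors:", len(errors))
print(success_df)
```

Splitting in halves keeps the number of retries small when only a few rows are bad, at the cost of running the function several times over the good rows that share a chunk with a bad one.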

✅ Validate Rows Using Pydantic

from abraxos import validate
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

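# Pass the model class itself: calling Person() with no arguments would raise,
# because name and age are required fields.
result = validate(df, Person)
print("Valid rows:", result.success_df, sep="\n")
print("Validation errors:", result.errors)

Example Output

Valid rows: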
   name  age
0   Joe   28

Validation errors:
[
  ValidationError: 1 validation error for Person
  age
    value is not a valid integer (type=type_error.integer),

  ValidationError: 1 validation error for Person
  name
    none is not an allowed value (type=type_error.none.not_allowed)
]
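
Row-level validation can be pictured as building one model instance per record and sorting rows by whether construction succeeds. The following is a rough sketch of that pattern, not Abraxos's implementation: `validate_rows` is a hypothetical helper, and the sample frame is built with `dtype=object` so that `None` stays `None`.

```python
import pandas as pd
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


def validate_rows(df, model):
    """Hypothetical helper: return (errors, errored_df, success_df)."""
    errors, bad_index, good_rows, good_index = [], [], [], []
    for idx, record in zip(df.index, df.to_dict(orient="records")):
        try:
            # Iterating a model instance yields (field, value) pairs.
            good_rows.append(dict(model(**record)))
            good_index.append(idx)
        except ValidationError as exc:
            errors.append(exc)
            bad_index.append(idx)
    success_df = pd.DataFrame(good_rows, index=good_index, columns=df.columns)
    return errors, df.loc[bad_index], success_df


people = pd.DataFrame(
    {"name": ["Joe", "Alice", None], "age": [28, "abc", 40]},
    dtype=object,  # keep None and mixed values exactly as given
)
errors, errored_df, success_df = validate_rows(people, Person)
print("Valid rows:", success_df, sep="\n")
print("Number of validation errors:", len(errors))
```

Rebuilding the success frame from the validated instances means any type coercion performed by the model is reflected in the output rows.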

🗃️ Insert Into SQL With Retry Logic

from abraxos import to_sql
from sqlalchemy import create_engine

engine = create_engine("sqlite:///example.db")
result = to_sql(df, "people", engine)

print("Successful inserts:", result.success_df.shape[0])
print("Failed rows:", result.errored_df, sep="\n")

Example Output

Successful inserts: 2
Failed rows:
   name  age
2  None   40
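
The same split-and-retry idea can be applied to inserts: attempt the whole chunk in one transaction, and if the database rejects it, roll back and retry each half. The sketch below uses the standard-library `sqlite3` module rather than SQLAlchemy to stay self-contained. It is only an illustration of the technique, with hypothetical helper names (`insert_chunk`, `to_sql_splitting_errors`) and an assumed `people` table that has a NOT NULL constraint on `name`.

```python
import sqlite3

import pandas as pd


def insert_chunk(conn, table, chunk):
    """Insert every row of chunk in one transaction (all or nothing)."""
    columns = ", ".join(chunk.columns)
    placeholders = ", ".join("?" for _ in chunk.columns)
    sql = f"INSERT INTO {table} ({columns}) VALUES ({placeholders})"
    rows = list(chunk.astype(object).itertuples(index=False, name=None))
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(sql, rows)


def to_sql_splitting_errors(df, table, conn):
    """Hypothetical helper: return (errors, errored_df, success_df)."""
    errors, bad_parts, ok_parts = [], [], []
    pending = [df]
    while pending:
        part = pending.pop()
        if part.empty:
            continue
        try:
            insert_chunk(conn, table, part)
            ok_parts.append(part)
        except sqlite3.Error as exc:
            if len(part) == 1:
                errors.append(exc)
                bad_parts.append(part)
            else:
                mid = len(part) // 2
                pending.extend([part.iloc[mid:], part.iloc[:mid]])
    success_df = pd.concat(ok_parts) if ok_parts else df.iloc[0:0]
    errored_df = pd.concat(bad_parts) if bad_parts else df.iloc[0:0]
    return errors, errored_df, success_df


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT NOT NULL, age INTEGER)")
people = pd.DataFrame({"name": ["Joe", "Alice", None], "age": [28, 35, 40]}, dtype=object)
errors, errored_df, success_df = to_sql_splitting_errors(people, "people", conn)
stored = conn.execute("SELECT name, age FROM people ORDER BY age").fetchall()
print("Successful inserts:", len(success_df))
print("Failed rows:", errored_df, sep="\n")
```

Because each attempt runs in its own transaction, a rejected chunk leaves no partial rows behind before its halves are retried.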

🧪 Test Coverage

Abraxos's internal structure is modular and testable. You can run tests via:

pytest tests/

📄 License

MIT License © 2024 Odos Matthews


🧙‍♂️ Author

Crafted by Odos Matthews to bring some magic to data workflows.
