Efficiently parse and validate all columns in a pandas DataFrame.

Project description

Vrame

Vrame is a Python library for parsing and validating every column of a pandas DataFrame. It relies on pandas' vectorized operations rather than row-by-row checks to speed up validation, which makes it a practical tool for data validation in data science and machine learning projects.

Features

  • Vectorized Validation: Utilizes pandas' vectorized operations for fast and efficient data validation.
  • Similar Syntax to Pydantic: Offers a familiar API for those who have used Pydantic, making it easy to adopt.
  • Custom Validators: Allows for the definition of custom validation rules to meet specific data requirements.
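To make "vectorized validation" concrete, here is a minimal sketch in plain pandas. It is not Vrame's implementation: the helper name `validate_range` and the choice of inclusive bounds are assumptions made for illustration. It only shows the general idea of checking a whole column in one call instead of looping over rows.

```python
import numpy as np
import pandas as pd


def validate_range(series: pd.Series, lower: float, upper: float,
                   nullable: bool = True) -> pd.Series:
    """Return a boolean mask that is True where a value passes the check.

    Hypothetical helper for illustration; not part of Vrame.
    """
    # Series.between compares the whole column at once
    # (both bounds are inclusive by default).
    in_range = series.between(lower, upper)
    if nullable:
        # Accept missing values when the column is declared nullable.
        return in_range | series.isna()
    return in_range


values = pd.Series([1, 3, 7, np.nan, -2])
mask = validate_range(values, lower=-1, upper=6)
print(mask.tolist())  # [True, True, False, True, False]
```

The result is a boolean mask, so the failing rows can be selected with `values[~mask]` for reporting.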

Installation

To install Vrame, use pip:

pip install vrame
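After installing, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. This is a generic check, not something Vrame provides, and the helper name `installed_version` is made up for this sketch.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints a version string if the package is installed, otherwise None.
print(installed_version("Vrame"))
```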

Usage

First, define a schema for your DataFrame using Vrame's syntax, which is similar to Pydantic's:

import pandas as pd
import numpy as np
from vrame.basemodel import BaseModel
from vrame.column_types import (
    Integer,
    Float,
    Boolean,
    Datetime,
    String,
    List,
    Tuple,
    Dictionary,
    Set,
    Object
)


# Each attribute is named after the DataFrame column it describes.
class Model(BaseModel):
    integer = Integer(lower=-1, upper=6, nullable=True)
    float = Float(lower=-1.0, upper=6.0, nullable=True)
    bool = Boolean(nullable=True)
    datetime = Datetime(lower="2024-03-20", upper="2024-03-21", nullable=True)
    list = List(nullable=True, min_items=1, max_items=3)
    tuple = Tuple(nullable=True, min_items=1, max_items=3)
    dictionary = Dictionary(nullable=True, min_items=1, max_items=3)
    set = Set(nullable=True, min_items=1, max_items=3)
    string = String(min_length=0, max_length=5, nullable=True)
    object = Object(nullable=True)


if __name__ == "__main__":
    df = pd.DataFrame(
        {
            'integer': [1, "2", 3, 4, np.nan],
            'float': [1.0, "2.0", 3, "4", "5"],
            'bool': [True, "False", "True", False, False],
            'datetime': [
                "2024-03-20",
                "2024-03-21",
                "2024-03-21",
                "2024-03-21",
                "2024-03-21"
            ],
            'list': [[1, 2], "[3, 4]", [5, 6], [7, 8], [9, 10]],
            'tuple': [(1, 2), "(3, 4)", (5, 6), (7, 8), (9, 10)],
            'dictionary': [
                {'a': 1, 'b': 2.1},
                "{'e': 3, 'f': 4.0}",
                {'a': 1, 'b': 2.1},
                {'a': 1, 'b': 2.1},
                {'a': 1, 'b': 2.1}
            ],
            'set': [{1, 2}, "{1, 2}", {1, 2}, {1, 2}, {1, 2}],
            'string': ["str1", "str2", "", "12345", "I"],
            'object': [1, 2.0, False, np.nan, None]
        }
    )

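    # Wrap the DataFrame in the model, then parse and validate every column
    # against the schema declared above.
    m = Model(df)
    df = m.parse_and_validate()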
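The example DataFrame mixes native values with string representations such as "2" or "[3, 4]", which is why a parsing step comes before validation. The sketch below shows how that kind of parsing can be done with plain pandas and the standard library. It is an illustration only and makes no claim about how Vrame parses columns internally; `parse_literal` is a hypothetical helper.

```python
import ast

import numpy as np
import pandas as pd


def parse_literal(value):
    """Turn a string such as "[3, 4]" into the Python object it spells out."""
    if isinstance(value, str):
        # ast.literal_eval only evaluates literals, so it is safer than eval.
        return ast.literal_eval(value)
    return value


df = pd.DataFrame(
    {
        "integer": [1, "2", 3, 4, np.nan],
        "list": [[1, 2], "[3, 4]", [5, 6], [7, 8], [9, 10]],
    }
)

# Numeric column: convert in one vectorized call; unparseable values become NaN.
df["integer"] = pd.to_numeric(df["integer"], errors="coerce")

# Container column: parse cell by cell, since each cell holds a Python object.
df["list"] = df["list"].map(parse_literal)

print(df["integer"].tolist())  # [1.0, 2.0, 3.0, 4.0, nan]
print(df["list"].tolist())     # [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]
```

Numeric columns benefit most from vectorized conversion; container columns still need per-cell parsing in this sketch because pandas stores them as generic Python objects.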

Download files

Source Distribution

Vrame-1.0.3.tar.gz (9.9 kB)

Built Distribution

Vrame-1.0.3-py3-none-any.whl (11.5 kB, Python 3)

