Marshmallow Schema generator for pandas and numpy

marshmallow-numerical

marshmallow-numerical is a library that helps you generate marshmallow Schemas for Pandas and Numpy data structures.

Usage

Let's start by creating an example dataframe for which we want to create a Schema. This dataframe has four columns: two of them are of string type, one is a float, and the last one is an integer.

import pandas as pd
import numpy as np
from marshmallow_numerical import SplitDataFrameSchema

animal_df = pd.DataFrame(
    [
        ("falcon", "bird", 389.0, 2),
        ("parrot", "bird", 24.0, 2),
        ("lion", "mammal", 80.5, 4),
        ("monkey", "mammal", np.nan, 4),
    ],
    columns=["name", "class", "max_speed", "num_legs"],
)
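The column types described above can be checked directly on the dataframe, and these dtypes are what the schema in the next step is built from. The following is a minimal sketch that uses only pandas and numpy; the exact name pandas reports for the string columns depends on the pandas version, so only the numeric columns are checked here.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

animal_df = pd.DataFrame(
    [
        ("falcon", "bird", 389.0, 2),
        ("parrot", "bird", 24.0, 2),
        ("lion", "mammal", 80.5, 4),
        ("monkey", "mammal", np.nan, 4),
    ],
    columns=["name", "class", "max_speed", "num_legs"],
)

# One dtype per column, indexed by column name.
dtypes = animal_df.dtypes
print(dtypes)

# The missing max_speed value is stored as NaN, so the column stays a float.
print(is_float_dtype(dtypes["max_speed"]))   # True
print(is_integer_dtype(dtypes["num_legs"]))  # True
```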

You can then create a marshmallow schema that validates and loads dataframes that have the same structure as the one above and that have been serialized by DataFrame.to_json with orient="split":

class AnimalSchema(SplitDataFrameSchema):
    """Automatically generated schema for animal dataframe"""

    dtypes = animal_df.dtypes

When passing a valid payload for a new animal, this schema will validate it and build a dataframe:

animal_schema = AnimalSchema()

new_animal = {
    "data": [("leopard", "mammal", 58.0, 4), ("ant", "insect", 0.288, 6)],
    "columns": ["name", "class", "max_speed", "num_legs"],
    "index": [0, 1],
}

new_animal_df = animal_schema.load(new_animal)

print(type(new_animal_df))
# <class 'pandas.core.frame.DataFrame'>
print(new_animal_df)
#       name   class  max_speed  num_legs
# 0  leopard  mammal     58.000         4
# 1      ant  insect      0.288         6

However, if we pass a payload that doesn't conform to the schema, load raises a marshmallow ValidationError with an informative message describing each error:

invalid_animal = {
    "data": [("leopard", "mammal", 58.0, "four")],  # num_legs is not an int
    "columns": ["name", "class", "num_legs"],  # max_speed column is missing
    "index": [0],
}

animal_schema.load(invalid_animal)

# Raises:
# marshmallow.exceptions.ValidationError: {'columns': ["Must be equal to ['name', 'class', 'max_speed', 'num_legs']."], 'data': {0: {3: ['Not a valid integer.']}}}

marshmallow-numerical can also generate Schemas for the orient=records format: follow the steps above, but use marshmallow_numerical.RecordsDataFrameSchema instead of SplitDataFrameSchema as the superclass for AnimalSchema.
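For comparison with the split format, the records format serializes each row as an object keyed by column name. The sketch below shows that shape using only pandas; that a records-based schema accepts exactly this list of dictionaries is an assumption here, not something confirmed by the text above.

```python
import json

import pandas as pd

df = pd.DataFrame(
    [("leopard", "mammal", 58.0, 4), ("ant", "insect", 0.288, 6)],
    columns=["name", "class", "max_speed", "num_legs"],
)

# orient="records" produces a JSON array with one object per row.
records = json.loads(df.to_json(orient="records"))

print(len(records))         # 2
print(sorted(records[0]))   # ['class', 'max_speed', 'name', 'num_legs']
print(records[1]["name"])   # ant
```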

Installation

marshmallow-numerical requires Python >= 3.6 and marshmallow >= 3.0. You can install it with pip:

pip install marshmallow-numerical

Contributing

Contributions are welcome!

You can report a problem or feature request in the issue tracker. If you feel that you can fix it or implement it, please submit a pull request referencing the issues it solves.

Unit tests written using the pytest framework are in the tests directory, and are run using tox on Python 3.6 and 3.7. You can run the tests by installing tox:

pip install tox

and then invoking tox, which runs the linters and tests for all Python versions. To target a specific Python version, run:

tox -e py36

We format the code with black. You can format your checkout of the code before committing it by running:

tox -e black -- .
