Marshmallow Schema generator for pandas and numpy
marshmallow-numerical
marshmallow-numerical is a library that helps you generate marshmallow Schemas for Pandas and Numpy data structures.
Usage
Let's start by creating an example dataframe for which we want to create a Schema. This dataframe has four columns: two of them are of string type, one is a float, and the last one is an integer.
import pandas as pd
import numpy as np
from marshmallow_numerical import SplitDataFrameSchema
animal_df = pd.DataFrame(
    [
        ("falcon", "bird", 389.0, 2),
        ("parrot", "bird", 24.0, 2),
        ("lion", "mammal", 80.5, 4),
        ("monkey", "mammal", np.nan, 4),
    ],
    columns=["name", "class", "max_speed", "num_legs"],
)
You can then create a marshmallow schema that will validate and load dataframes that follow the same structure as the one above and that have been serialized with DataFrame.to_json with the orient=split format:
class AnimalSchema(SplitDataFrameSchema):
    """Automatically generated schema for animal dataframe"""

    dtypes = animal_df.dtypes
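Since the schema is driven by animal_df.dtypes, it can help to look at that Series before defining the class. The following is a minimal sketch that uses only pandas; it does not show how marshmallow-numerical maps each dtype to a marshmallow field, and the exact dtype reported for the two string columns depends on the pandas version.

```python
import numpy as np
import pandas as pd

# Same example dataframe as above.
animal_df = pd.DataFrame(
    [
        ("falcon", "bird", 389.0, 2),
        ("parrot", "bird", 24.0, 2),
        ("lion", "mammal", 80.5, 4),
        ("monkey", "mammal", np.nan, 4),
    ],
    columns=["name", "class", "max_speed", "num_legs"],
)

# One dtype per column; this Series is what the schema class is given.
dtypes = animal_df.dtypes
for column, dtype in dtypes.items():
    print(f"{column}: {dtype}")

# max_speed holds floats (including NaN), num_legs holds integers.
speed_is_float = pd.api.types.is_float_dtype(dtypes["max_speed"])
legs_is_int = pd.api.types.is_integer_dtype(dtypes["num_legs"])
print(speed_is_float, legs_is_int)
```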
When passing a valid payload for a new animal, this schema will validate it and build a dataframe:
animal_schema = AnimalSchema()
new_animal = {
    "data": [("leopard", "mammal", 58.0, 4), ("ant", "insect", 0.288, 6)],
    "columns": ["name", "class", "max_speed", "num_legs"],
    "index": [0, 1],
}
new_animal_df = animal_schema.load(new_animal)
print(type(new_animal_df))
# <class 'pandas.core.frame.DataFrame'>
print(new_animal_df)
#       name   class  max_speed  num_legs
# 0  leopard  mammal     58.000         4
# 1      ant  insect      0.288         6
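A payload with this shape does not have to be written by hand. As a sketch using only pandas and the standard library, serializing a dataframe with DataFrame.to_json(orient="split") and parsing the result gives a dict with the same three keys. The variable names below are illustrative and not part of marshmallow-numerical.

```python
import json

import pandas as pd

# Illustrative dataframe with the same columns as the animal example.
df = pd.DataFrame(
    [("leopard", "mammal", 58.0, 4), ("ant", "insect", 0.288, 6)],
    columns=["name", "class", "max_speed", "num_legs"],
)

# to_json returns a JSON string; parse it back into plain Python objects.
payload = json.loads(df.to_json(orient="split"))

print(sorted(payload))     # ['columns', 'data', 'index']
print(payload["columns"])  # ['name', 'class', 'max_speed', 'num_legs']
print(payload["index"])    # [0, 1]
```

Note that the rows come back as JSON arrays (Python lists) rather than tuples.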
However, if we pass a payload that doesn't conform to the schema, it will raise a marshmallow ValidationError exception with an informative message about the errors:
invalid_animal = {
    "data": [("leopard", "mammal", 58.0, "four")],  # num_legs is not an int
    "columns": ["name", "class", "num_legs"],  # missing max_speed column
    "index": [0],
}
animal_schema.load(invalid_animal)
# Raises:
# marshmallow.exceptions.ValidationError: {'columns': ["Must be equal to ['name', 'class', 'max_speed', 'num_legs']."], 'data': {0: {3: ['Not a valid integer.']}}}
marshmallow_numerical can also generate Schemas for the orient=records format by following the above steps but using marshmallow_numerical.RecordsDataFrameSchema as the superclass for AnimalSchema.
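For comparison with the split layout, here is a sketch of what pandas itself emits for orient=records, using only pandas and the standard library. That RecordsDataFrameSchema accepts exactly this list of row dicts is an assumption based on the format name, not something verified here.

```python
import json

import pandas as pd

# Illustrative dataframe with the same columns as the animal example.
df = pd.DataFrame(
    [("leopard", "mammal", 58.0, 4), ("ant", "insect", 0.288, 6)],
    columns=["name", "class", "max_speed", "num_legs"],
)

# orient="records" yields one JSON object per row, keyed by column name;
# the index is not included in this layout.
records = json.loads(df.to_json(orient="records"))

for record in records:
    print(record)
```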
Installation
marshmallow-numerical requires Python >= 3.6 and marshmallow >= 3.0. You can install it with pip:
pip install marshmallow-numerical
Contributing
Contributions are welcome!
You can report a problem or feature request in the issue tracker. If you feel that you can fix it or implement it, please submit a pull request referencing the issues it solves.
Unit tests written using the pytest framework are in the tests directory, and are run using tox on Python 3.6 and 3.7. You can run the tests by installing tox:
pip install tox
and running the linters and tests for all Python versions by running tox, or for a specific Python version by running:
tox -e py36
We format the code with black, and you can format your checkout of the code before committing it by running:
tox -e black -- .