Named Data Frames
namedframes
Basic type annotation support for Pandas and Spark data frames. The goal is to provide a convenient way to specify a name-to-type mapping for a frame's columns. Ensuring that the columns actually conform to the declared types is left to the user, i.e. this provides named data frames, not typed data frames.
Installation
pip install namedframes
Usage
Pandas
import pandas as pd
from namedframes import PandasNamedFrame
class InputDF(PandasNamedFrame):
    x: float

class OutputDF(InputDF):
    blah: bool

def transform(input_data: InputDF) -> OutputDF:
    return OutputDF(input_data.assign(blah=True))

input_df = InputDF(pd.DataFrame({"x": [1.1, 2.2]}))
output = transform(input_df)
isinstance(input_df, InputDF)
True
isinstance(output, OutputDF)
True
If a declared column is missing, a validation error is raised:
OutputDF(input_df)
ValueError: missing columns: [('blah', <class 'bool'>)]
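The name check above can be pictured with a small sketch. This is not the library's implementation: `InputSchema`, `OutputSchema`, `missing_columns`, and `check_columns` are hypothetical names used only for illustration, and the sketch works on plain column names rather than on a real data frame. It assumes only that declared columns are read from class annotations, including inherited ones.

```python
from typing import Iterable, List, Tuple, get_type_hints


class InputSchema:
    """Hypothetical stand-in for a named-frame class: annotations only."""
    x: float


class OutputSchema(InputSchema):
    blah: bool


def missing_columns(schema: type, columns: Iterable[str]) -> List[Tuple[str, type]]:
    """Return (name, type) pairs declared on `schema` but absent from `columns`."""
    present = set(columns)
    # get_type_hints walks the class hierarchy, so inherited annotations count.
    declared = get_type_hints(schema)
    return [(name, tp) for name, tp in declared.items() if name not in present]


def check_columns(schema: type, columns: Iterable[str]) -> None:
    """Raise ValueError if any declared column name is missing (names only, no dtypes)."""
    missing = missing_columns(schema, columns)
    if missing:
        raise ValueError(f"missing columns: {missing}")


check_columns(InputSchema, ["x"])  # passes: "x" is present
try:
    check_columns(OutputSchema, ["x"])  # "blah" is declared but absent
except ValueError as err:
    print(err)
```

With a Pandas frame, the same function could be given `df.columns`. Note that the declared types are only reported, never compared with the column dtypes, which matches the "named, not typed" intent described earlier.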
Spark
namedframes
includes an option for pyspark dataframes.
Using it requires installation of pyspark
. You can install this
separately or with the [pyspark]
flag to namedframes
, i.e.,
pip install namedframes[pyspark]
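Since pyspark is an optional dependency, it can be handy to confirm it is present before using the Spark frames. A minimal sketch using only the standard library follows; `has_module` is a hypothetical helper, not part of namedframes.

```python
import importlib.util


def has_module(name: str) -> bool:
    """True if a top-level module named `name` can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if has_module("pyspark"):
    print("pyspark is available; the Spark frames can be used")
else:
    print("pyspark not found; install it separately or via namedframes[pyspark]")
```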
Example usage:
import pandas as pd
from pyspark.sql import SparkSession
from namedframes import SparkNamedFrame
class InputDF(SparkNamedFrame):
    x: float

spark = SparkSession.builder.getOrCreate()
spark_df = spark.createDataFrame(pd.DataFrame({"x": [1.1, 2.2]}))
input_df = InputDF(spark_df)
File details
Details for the file namedframes-0.1.4.tar.gz.
File metadata
- Download URL: namedframes-0.1.4.tar.gz
- Upload date:
- Size: 2.9 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0042cb776af58735d4da2ec1e366f209a27b7676b36c92775827b30f646831fd
MD5 | d0200e9a2e073c12a0f71549c8fd0b3f
BLAKE2b-256 | 17aa624d7f9d07e1e0344e4c97e9368549d919ed8f77db6b1f8adcb0f766cf91
File details
Details for the file namedframes-0.1.4-py2.py3-none-any.whl.
File metadata
- Download URL: namedframes-0.1.4-py2.py3-none-any.whl
- Upload date:
- Size: 4.3 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.2.0 pkginfo/1.6.1 requests/2.24.0 setuptools/50.3.0 requests-toolbelt/0.9.1 tqdm/4.48.2 CPython/3.7.5
File hashes
Algorithm | Hash digest
---|---
SHA256 | 81608496813cedcf4c6da53414662e95ac8a00e5a50dd45e11f71defc548d58d
MD5 | 127ad98594b009ad116f77561b86c357
BLAKE2b-256 | 2a66457381c917ad8336c4bde7f4d9adb31adbb5f341bfd231920b20969fd9cf