faker-pyspark is a PySpark DataFrame and Schema provider for the Faker Python package.
PySpark provider for Faker
faker-pyspark is a PySpark DataFrame and Schema (StructType) provider for the Faker Python package.
Description
faker-pyspark provides PySpark-based fake data for testing purposes. "Fake" in this context really means "random": the data may look real, but I make no claims about its accuracy, so do not use it as real data!
Installation
Install with pip:
pip install faker-pyspark
Add as a provider to your Faker instance:
from faker import Faker
from faker_pyspark import PySparkProvider
fake = Faker()
fake.add_provider(PySparkProvider)
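In a test suite where PySpark is an optional dependency, the registration above can be wrapped in a small helper. The following is only a sketch: `make_fake` is a hypothetical name, not part of faker-pyspark, and it relies on nothing beyond the two imports and the `add_provider` call shown above.

```python
def make_fake():
    """Return a Faker instance with PySparkProvider registered, or None.

    `make_fake` is a hypothetical helper, not part of faker-pyspark.
    """
    try:
        # ImportError is raised here when either package is not installed.
        from faker import Faker
        from faker_pyspark import PySparkProvider
    except ImportError:
        return None

    fake = Faker()
    fake.add_provider(PySparkProvider)
    return fake


fake = make_fake()
if fake is None:
    print("faker-pyspark is not available; skipping PySpark fake data")
```

Returning `None` instead of raising lets callers decide whether to skip a test or fail it.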
PySpark DataFrame, Schema and more
>>> df = fake.pyspark_dataframe()
>>> schema = fake.pyspark_schema()
>>> df_updated = fake.pyspark_update_dataframe(df)
>>> column_names = fake.pyspark_column_names()
>>> data = fake.pyspark_data_dict_using_schema(schema)
>>> data = fake.pyspark_data_dict()
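To make the idea of "a data dict generated from a schema" concrete, here is a minimal standard-library sketch. It is not faker-pyspark's implementation and does not reflect its return types: the `SCHEMA` list, `random_value`, and `data_dict_using_schema` are all hypothetical stand-ins, and a real PySpark schema would be a `StructType` rather than a list of pairs.

```python
import random
import string

# Hypothetical stand-in for a schema: (column name, Python type) pairs.
# A real PySpark schema would be a StructType; this sketch avoids that dependency.
SCHEMA = [("id", int), ("name", str), ("score", float), ("active", bool)]


def random_value(py_type, rng):
    """Return a random value of the requested Python type."""
    if py_type is bool:
        return rng.random() < 0.5
    if py_type is int:
        return rng.randint(0, 1000)
    if py_type is float:
        return rng.uniform(0.0, 1.0)
    if py_type is str:
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(8))
    raise TypeError(f"unsupported type: {py_type!r}")


def data_dict_using_schema(schema, rng):
    """Build one row as a dict whose keys and value types follow the schema."""
    return {name: random_value(py_type, rng) for name, py_type in schema}


rng = random.Random(0)  # seeded so repeated runs produce the same row
row = data_dict_using_schema(SCHEMA, rng)
print(row)
```

The values are random rather than realistic, which matches the package's own caveat that "fake" here means "random".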
Download files
Source Distribution
faker_pyspark-0.6.0.tar.gz (3.8 kB)
Built Distribution
Hashes for faker_pyspark-0.6.0-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | 0ed4e6ec8082962971696fadcd5bb539dfe8f5f9c810d9d413367f49a54960c6
MD5 | 013ebb53916cd785157d1813849a0e65
BLAKE2b-256 | c07bff5595a54b11a4ce0f6d9c208dc3fda1c4ebfd7e0068e2d6c54a497691e1