4 projects
ufs
Unified File System - an object-oriented way to work seamlessly across POSIX and S3 filesystems
faker-pyspark
faker-pyspark is a PySpark DataFrame and schema provider for the Faker Python package
pyspark-delta-scd2
This project uses faker-pyspark to generate random schemas and DataFrames that mimic data table snapshots. The snapshots are then processed to apply the SCD2 (Slowly Changing Dimension Type 2) pattern, with a Delta table as the destination.
RemoteDiff
Check filesystem/ownership/permission/size differences in remote/local paths.