pyspark-sampling

Project description

This is a Python gRPC stub for sparksampling.

sparksampling

sparksampling is a PySpark-based gRPC service for sampling and data quality assessment. It supports containerized deployment and Spark on Kubernetes (K8S).

Features

  • Common sampling methods: Random, Stratified, Simple

  • Relationship sampling based on a DAG and topological sorting

  • Cloud Native and Spark on K8S support
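The relationship sampling feature listed above can be illustrated with a small, pure-Python sketch. It does not show sparksampling's own implementation or API. The tables, the foreign-key mapping, and the function name relationship_sample are all hypothetical, and plain lists of dicts stand in for Spark DataFrames. The idea shown is: order the tables topologically so parents come before children, sample the root tables at random, then keep only the child rows whose foreign key points at a sampled parent row.

```python
import random
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy dataset: table name -> list of rows.
tables = {
    "users": [{"id": i} for i in range(10)],
    "orders": [{"id": i, "user_id": i % 10} for i in range(30)],
    "items": [{"id": i, "order_id": i % 30} for i in range(60)],
}

# Hypothetical relationships: child table -> (parent table, foreign key column).
relations = {
    "orders": ("users", "user_id"),
    "items": ("orders", "order_id"),
}


def relationship_sample(tables, relations, fraction, seed=0):
    """Sample root tables at random, then filter children by foreign key."""
    rng = random.Random(seed)

    # Build the DAG as node -> set of predecessors (parents).
    graph = {name: set() for name in tables}
    for child, (parent, _fk) in relations.items():
        graph[child].add(parent)

    sampled = {}
    # static_order() yields every parent before any of its children.
    for name in TopologicalSorter(graph).static_order():
        rows = tables[name]
        if name not in relations:
            # Root table: simple random sampling without replacement.
            k = max(1, round(len(rows) * fraction))
            sampled[name] = rng.sample(rows, k)
        else:
            # Child table: keep rows that reference a sampled parent row.
            parent, fk = relations[name]
            kept_ids = {row["id"] for row in sampled[parent]}
            sampled[name] = [row for row in rows if row[fk] in kept_ids]
    return sampled


sample = relationship_sample(tables, relations, fraction=0.3, seed=1)
for name, rows in sample.items():
    print(name, len(rows))
```

Processing tables in topological order means every child is filtered against an already-sampled parent, so the sampled tables stay referentially consistent.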

QUICK START

Installation

To try it out, install the package directly from PyPI:

pip install sparksampling

Then run:

sparksampling

The service starts and listens on port 8530.
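To confirm that the service is listening, a plain TCP check is enough. The sketch below uses only the Python standard library and makes no gRPC calls. The host "localhost" and the helper name is_port_open are assumptions for illustration; only the port number 8530 comes from the text above.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # "localhost" is an assumption; adjust if the service runs elsewhere.
    print("sparksampling reachable:", is_port_open("localhost", 8530))
```

A successful connection only shows that something accepts TCP connections on the port. It does not verify that the process is the sparksampling gRPC service.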

Docker

docker run -p 8530:8530 wh1isper/pysparksampling:latest

MORE

For more information, see our GitHub page: https://github.com/Wh1isper/pyspark-sampling/

Download files

Source Distribution

sparksampling_client-0.1.1.tar.gz (3.3 kB)

Uploaded Source

Built Distribution

sparksampling_client-0.1.1-py3-none-any.whl (4.7 kB)

Uploaded Python 3

File details

Details for the file sparksampling_client-0.1.1.tar.gz.

File metadata

  • Download URL: sparksampling_client-0.1.1.tar.gz
  • Upload date:
  • Size: 3.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.8.13

File hashes

Hashes for sparksampling_client-0.1.1.tar.gz

Algorithm     Hash digest
SHA256        913bb379842abafd1cc250288b017c5661e6cf6d7f69565e30ce9da6e1207ddc
MD5           99436fb33f1f5314f29dcae90a521284
BLAKE2b-256   d82a0f48e936b23fd24f42baa50d226911512dd72cb39443c4a1f59f1a151d96
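
A downloaded file can be checked against the SHA256 digest listed above with the Python standard library. This is a minimal sketch: the helper names sha256_of and matches_sha256 are made up for illustration, and it assumes the file was saved in the current directory under its published name.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path, expected_hex):
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumed local path; the digest is the SHA256 value from the listing.
    sdist = Path("sparksampling_client-0.1.1.tar.gz")
    expected = "913bb379842abafd1cc250288b017c5661e6cf6d7f69565e30ce9da6e1207ddc"
    if sdist.exists():
        print("hash matches:", matches_sha256(sdist, expected))
    else:
        print("file not found:", sdist)
```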

File details

Details for the file sparksampling_client-0.1.1-py3-none-any.whl.

File hashes

Hashes for sparksampling_client-0.1.1-py3-none-any.whl

Algorithm     Hash digest
SHA256        4643b1060696b7d22d599f7cb95a25a0511cb1fd6d7e587e580c760c79554aa3
MD5           8879617c8403aceb51970de8219938de
BLAKE2b-256   31a8ca666f620d183155e5798a5738fc58464c09e4fd4d36014e87a6ce836ba5
