# Configurable PySpark Pipeline
_A configurable PySpark pipeline library._

## Getting Started
* Requirements:
  * Python 3.5
* Install the package using pip:

``` bash
$ pip install sparkml-pipe
```
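After installing, a quick check can confirm that the package is visible to the interpreter. The snippet below is a minimal sketch using only the standard library; the import name `sparkmlpipe` is an assumption inferred from the package directory mentioned in this README, and `is_installed` is a hypothetical helper, not part of the library.

```python
import importlib.util
import sys


def is_installed(module_name):
    """Return True if a top-level module can be located on the import path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "sparkmlpipe" is an assumed import name, inferred from the source directory.
    print("Python version:", sys.version.split()[0])
    print("sparkmlpipe importable:", is_installed("sparkmlpipe"))
```

If the second line prints `False`, the install likely went into a different environment than the one running the script.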

## Project Organization
```
├── README.md          <- The top-level README for developers using this project.
├── data               <- Data for testing the library.
├── docs               <- A default Sphinx project; see sphinx-doc.org for details.
├── requirements.txt   <- The requirements file for reproducing the analysis environment,
│                         e.g. generated with `pip freeze > requirements.txt`.
├── sparkmlpipe        <- Source code for use in this project.
│   ├── conf           <- YAML config files for the PySpark pipeline.
│   ├── pipeline       <- PySpark model pipelines.
│   ├── stat           <- PySpark stat pipelines.
│   ├── test           <- Test code.
│   └── utils          <- Utility functions.
└── setup.py           <- Metadata about your project for easy distribution.
```



## Contributing
### Check out the codebase
``` bash
$ git checkout develop
```
### Update the PyPI version
* Update sparkmlpipe/\_\_version\_\_.py if needed
* Upload to PyPI
``` bash
$ python setup.py sdist
$ pip install twine
# upload to Test PyPI
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
# upload to PyPI
$ twine upload dist/*
```
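To confirm the version bump before uploading, the version string can be read straight from the file without importing the package. This is a minimal sketch that assumes `sparkmlpipe/__version__.py` contains a line of the form `__version__ = "x.y.z"`; the file's actual contents are not shown in this README, and `read_version` is a hypothetical helper.

```python
import re


def read_version(path):
    """Extract the version string from a file containing `__version__ = "x.y.z"`."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    # Assumed format: a top-level assignment with a single- or double-quoted string.
    match = re.search(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", text, re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ assignment found in {}".format(path))
    return match.group(1)
```

For example, `read_version("sparkmlpipe/__version__.py")` would return the string that `setup.py sdist` is expected to package, provided the file follows the assumed format.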
### Installing development requirements
``` bash
$ pip install -r requirements.txt
```
