# Configurable PySpark Pipeline
_A configurable PySpark pipeline library._

## Getting Started
* Requirements:
  * Python 3.5
* Install the package using pip:

``` bash
$ pip install sparkml-pipe
```
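
To confirm that the install succeeded, the installed version can be queried from Python. This is a minimal sketch, not part of the package: `installed_version` is a hypothetical helper name. It assumes `importlib.metadata` is available (Python 3.8 and later) and falls back to `pkg_resources` from setuptools on older interpreters.

``` python
try:
    from importlib import metadata  # standard library on Python 3.8 and later
except ImportError:
    metadata = None  # older interpreters; use the setuptools fallback below


def installed_version(dist_name):
    """Return the installed version of dist_name, or None if it is not installed."""
    if metadata is not None:
        try:
            return metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            return None
    # Fallback path: requires setuptools to be installed.
    import pkg_resources
    try:
        return pkg_resources.get_distribution(dist_name).version
    except pkg_resources.DistributionNotFound:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("sparkml-pipe"))
```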

## Project Organization
```
├── README.md          <- The top-level README for developers using this project.
├── data               <- Data for testing the library.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details.
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`.
│
├── sparkmlpipe        <- Source code for use in this project.
│   ├── conf           <- YAML config files for the PySpark pipeline.
│   │
│   ├── pipeline       <- PySpark model pipelines.
│   │
│   ├── stat           <- PySpark stat pipelines.
│   │
│   ├── test           <- Test code.
│   │
│   └── utils          <- Utility functions.
│
└── setup.py           <- Metadata about your project for easy distribution.
```



## Contributing
### Check out the codebase
``` bash
$ git checkout develop
```
### Update the PyPI version
* Update sparkmlpipe/\_\_version\_\_.py if needed
* Upload to PyPI
``` bash
$ python setup.py sdist
$ pip install twine
# upload to Test PyPI
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
# upload to PyPI
$ twine upload dist/*
```
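
Before building the source distribution, it can help to check which version string the upload will carry. The following is a minimal sketch under one assumption: that `sparkmlpipe/__version__.py` contains a plain assignment such as `__version__ = "0.2.5"`. The helper name `read_version` is hypothetical and is not part of the package.

``` python
import re


def read_version(path):
    """Return the string assigned to __version__ in the file at path."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    # Match a line of the form: __version__ = "x.y.z" (single or double quotes).
    match = re.search(
        r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", text, re.MULTILINE
    )
    if match is None:
        raise ValueError("no __version__ assignment found in " + path)
    return match.group(1)
```

Calling `read_version("sparkmlpipe/__version__.py")` from the repository root returns the version string without importing the package, so it works even when PySpark is not installed in the current environment.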
### Installing development requirements
``` bash
$ pip install -r requirements.txt
```
