# Configurable PySpark Pipeline
_A configurable PySpark pipeline library._

## Getting Started
* Requirements:
  * Python 3.5
* Install the package using pip:

``` bash
$ pip install sparkml-pipe
```
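
After installing, a quick check that the package can be found on the Python path may be useful. The sketch below assumes the import name is `sparkmlpipe` (taken from the source directory name, not confirmed by the package itself) and that `pyspark` is needed at runtime; `is_importable` is a hypothetical helper, not part of the library.

``` python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # Assumption: "sparkmlpipe" is the import name and "pyspark" is a runtime dependency.
    for name in ("pyspark", "sparkmlpipe"):
        status = "found" if is_importable(name) else "missing"
        print("{}: {}".format(name, status))
```

This only locates the modules without importing them, so it does not start Spark or run any package code.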

## Project Organization
```
├── README.md          <- The top-level README for developers using this project.
├── data               <- Data for testing the library.
├── docs               <- A default Sphinx project; see sphinx-doc.org for details.
├── requirements.txt   <- The requirements file for reproducing the analysis environment, e.g.
│                         generated with `pip freeze > requirements.txt`.
├── sparkmlpipe        <- Source code for use in this project.
│   ├── conf           <- YAML config files for the PySpark pipeline.
│   ├── pipeline       <- PySpark model pipelines.
│   ├── stat           <- PySpark stat pipelines.
│   ├── test           <- Test code.
│   └── utils          <- Utility functions.
└── setup.py           <- Metadata about your project for easy distribution.
```



## Contributing
### Check out the codebase
``` bash
$ git checkout develop
```
### Update the PyPI version
* Update `sparkmlpipe/__version__.py` if needed.
* Build a source distribution and upload it to PyPI:
``` bash
$ python setup.py sdist
$ pip install twine
# upload to Test PyPI
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
# upload to PyPI
$ twine upload dist/*
```
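
Before uploading, it can help to confirm which version string will be published. The following is a minimal sketch that assumes `sparkmlpipe/__version__.py` contains a plain assignment such as `__version__ = "0.2.3"`; the actual file layout is not shown here, and `read_version` is a hypothetical helper.

``` python
import re


def read_version(path):
    """Extract the string assigned to __version__ in the file at `path`."""
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    # Assumption: the file holds a simple quoted assignment, e.g. __version__ = "0.2.3".
    match = re.search(r"^__version__\s*=\s*['\"]([^'\"]+)['\"]", text, re.MULTILINE)
    if match is None:
        raise ValueError("no __version__ assignment found in {}".format(path))
    return match.group(1)
```

Calling `read_version("sparkmlpipe/__version__.py")` from the repository root would then return the version string, which can be compared against the archive name in `dist/` before running `twine upload`.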
### Installing development requirements
``` bash
$ pip install -r requirements.txt
```
