# Configurable PySpark Pipeline

A configurable PySpark pipeline library.
## Getting Started

Requirements:

- Python 3.5

Install the package using pip:

$ pip install sparkml-pipe
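To illustrate the general idea of a config-driven pipeline (a declarative config selects and orders processing stages), here is a minimal, purely illustrative sketch in plain Python. It does not show sparkml-pipe's actual API: `build_pipeline`, `STAGE_REGISTRY`, and the config keys are all hypothetical names invented for this example.

```python
# Illustrative sketch only -- not sparkml-pipe's real API.
# A config-driven pipeline: a declarative config picks stages by name,
# and the builder wires them into a single callable.

# Registry mapping stage names to simple string-processing functions.
STAGE_REGISTRY = {
    "lowercase": lambda s: s.lower(),
    "strip": lambda s: s.strip(),
    "tokenize": lambda s: s.split(),
}

def build_pipeline(config):
    """Turn the stage names listed in a config into a callable pipeline."""
    stages = [STAGE_REGISTRY[name] for name in config["stages"]]

    def run(value):
        # Apply each configured stage in order.
        for stage in stages:
            value = stage(value)
        return value

    return run

config = {"stages": ["strip", "lowercase", "tokenize"]}
pipeline = build_pipeline(config)
print(pipeline("  Hello Spark  "))  # ['hello', 'spark']
```

The same pattern, with PySpark `Transformer` and `Estimator` stages in the registry and YAML on disk instead of a dict, is what a configurable PySpark pipeline library builds on.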
## Project Organization
├── README.md          <- The top-level README for developers using this project.
├── data               <- Data for testing the library.
│
├── docs               <- A default Sphinx project; see sphinx-doc.org for details.
│
├── requirements.txt   <- The requirements file for reproducing the analysis environment,
│                         e.g. generated with `pip freeze > requirements.txt`.
│
├── sparkmlpipe        <- Source code for use in this project.
│   ├── conf           <- YAML config files for the PySpark pipeline.
│   │
│   ├── pipeline       <- PySpark model pipelines.
│   │
│   ├── stat           <- PySpark stat pipelines.
│   │
│   ├── test           <- Test code.
│   │
│   └── utils          <- Utility functions.
│
└── setup.py           <- Metadata about the project for easy distribution.
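The `conf` directory holds the YAML files that drive the pipeline. As a purely illustrative sketch of what such a file might contain (the actual schema is not documented here, so every key name below is hypothetical):

```yaml
# Hypothetical config sketch -- these keys are NOT sparkml-pipe's real schema.
pipeline:
  name: example-pipeline
  stages:
    - type: string_indexer       # index a categorical column
      input_col: category
      output_col: category_idx
    - type: vector_assembler     # collect feature columns into one vector
      input_cols: [category_idx, amount]
      output_col: features
    - type: logistic_regression  # final estimator stage
      features_col: features
      label_col: label
```

Keeping the stage list in YAML rather than code means a pipeline can be changed, versioned, and reviewed without touching Python.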
## Contributing

Check out the development branch:

$ git checkout develop
### Updating the PyPI version

- Update `sparkmlpipe/__version__.py` if needed
- Build the source distribution and upload it with twine:

$ python setup.py sdist
$ pip install twine
# upload to Test PyPI
$ twine upload --repository-url https://test.pypi.org/legacy/ dist/*
# upload to PyPI
$ twine upload dist/*
### Installing development requirements

$ pip install -r requirements.txt
Latest source distribution on PyPI: `sparkml-pipe-0.1.12.tar.gz` (8.8 kB).