Helper files/functions/classes for generic PySpark processes

toolbox-pyspark

Introduction

The purpose of this package is to provide some helper files/functions/classes for generic PySpark processes.

Key URLs

For reference, these URLs are used:

Type            Source  URL
Git Repo        GitHub  https://github.com/data-science-extensions/toolbox-pyspark
Python Package  PyPI    https://pypi.org/project/toolbox-pyspark
Package Docs    Pages   https://data-science-extensions.com/toolbox-pyspark/

Installation

You can install this package using pip, pipenv, or poetry.

Using pip:

  1. In your terminal, run:

    python3 -m pip install --upgrade pip
    python3 -m pip install toolbox-pyspark
    
  2. Or, in your requirements.txt file, add:

    toolbox-pyspark
    

    Then run:

    python3 -m pip install --upgrade pip
    python3 -m pip install --requirement=requirements.txt
    

Using pipenv:

  1. Install using a Pipfile:

    In your Pipfile, add:

    [[source]]
    url = "https://pypi.org/simple"
    verify_ssl = true
    name = "pypi"
    
    [packages]
    toolbox-pyspark = "*"
    

    Then run:

    python3 -m pip install pipenv
    python3 -m pipenv install --verbose --skip-lock --categories=root --index=pypi toolbox-pyspark
    
  2. Or, in your requirements.txt file, add:

    toolbox-pyspark
    

    Then run:

    python3 -m pipenv install --verbose --skip-lock --requirements=requirements.txt
    
  3. Or just run this:

    python3 -m pipenv install --verbose --skip-lock toolbox-pyspark
    

Using poetry:

  1. In your pyproject.toml file, add:

    [tool.poetry.dependencies]
    toolbox-pyspark = "*"
    

    Then run:

    poetry install
    
  2. Or just run this:

    poetry add toolbox-pyspark
    poetry install
    poetry sync
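
Whichever installer you use, you may want to confirm that the package landed in the active environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is an illustration and is not part of toolbox-pyspark.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Prints the version string when installed, otherwise None.
print(installed_version("toolbox-pyspark"))
```

Returning `None` rather than raising keeps the check usable in scripts that only need a yes/no answer.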
    

Contribution

Contributions are always welcome.

  1. First, either fork or branch the main repo.

  2. Clone your forked/branched repo.

  3. Build your environment:

    1. With pipenv on Windows:

      if (-not (Test-Path .venv)) {mkdir .venv}
      python -m pipenv install --requirements requirements.txt --requirements requirements-dev.txt --skip-lock
      python -m pipenv run pre-commit install
      python -m pipenv shell
      
    2. With pipenv on Linux:

      mkdir -p .venv
      python3 -m pipenv install --requirements requirements.txt --requirements requirements-dev.txt --skip-lock
      python3 -m pipenv run pre-commit install
      python3 -m pipenv shell
      
    3. With poetry on Windows:

      python -m pip install --upgrade pip
      python -m pip install poetry
      python -m poetry init
      python -m poetry add $(cat requirements/root.txt)
      python -m poetry add --group=dev $(cat requirements/dev.txt)
      python -m poetry add --group=test $(cat requirements/test.txt)
      python -m poetry add --group=docs $(cat requirements/docs.txt)
      python -m poetry install
      python -m poetry run pre-commit install
      python -m poetry shell
      
    4. With poetry on Linux:

      python3 -m pip install --upgrade pip
      python3 -m pip install poetry
      python3 -m poetry init
      python3 -m poetry add $(cat requirements/root.txt)
      python3 -m poetry add --group=dev $(cat requirements/dev.txt)
      python3 -m poetry add --group=test $(cat requirements/test.txt)
      python3 -m poetry add --group=docs $(cat requirements/docs.txt)
      python3 -m poetry install
      python3 -m poetry run pre-commit install
      python3 -m poetry shell
      
  4. Start contributing.

  5. When you're happy with the changes, raise a Pull Request to merge with the main branch again.

Build and Test

Before raising a Pull Request, please make sure that:

  1. Your code follows PEP 8.
  2. You write a unit test for each function/feature you include.
  3. Code coverage is 100%.
  4. All unit tests pass.
  5. MyPy reports no errors.
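
As an illustration of these requirements, here is a small sketch of a type-annotated function with one unit test per behaviour. The function `normalise_column_name` is hypothetical and is not part of toolbox-pyspark; it only shows the shape a contribution might take.

```python
import unittest


def normalise_column_name(name: str) -> str:
    """Hypothetical helper: trim, lower-case, and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


class TestNormaliseColumnName(unittest.TestCase):
    def test_spaces_and_case(self) -> None:
        # Surrounding whitespace is removed and inner spaces become underscores.
        self.assertEqual(normalise_column_name("  Order ID "), "order_id")

    def test_already_clean(self) -> None:
        # A clean name passes through unchanged.
        self.assertEqual(normalise_column_name("order_id"), "order_id")
```

Test classes written this way are collected by both `python -m unittest` and pytest, and the annotations give MyPy something to check.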

Testing

  • Run them all together:

    poetry run make check
    
  • Or run them individually:

    • Black:

      poetry run make check-black
      
    • PyTests:

      poetry run make check-pytest
      
    • MyPy:

      poetry run make check-mypy
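
If you would rather see every failure in one pass instead of stopping at the first, the individual checks can be driven from a short script. This is only a sketch: it assumes the make targets are named `check-black`, `check-pytest`, and `check-mypy`, and the names `run_checks` and `TARGETS` are illustrative, not something the project ships.

```python
import subprocess
from typing import Callable, Dict, Sequence

# Assumed make target names for the individual checks.
TARGETS = ("check-black", "check-pytest", "check-mypy")


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def run_checks(
    targets: Sequence[str] = TARGETS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> Dict[str, int]:
    """Run every target, continuing after failures, and map target to exit code."""
    return {target: runner(["poetry", "run", "make", target]) for target in targets}
```

Calling `run_checks()` from the repository root returns a dictionary such as `{"check-black": 0, ...}`; any non-zero value marks a failed check. The `runner` parameter exists so the function can be exercised without invoking poetry.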
      
