A flake8 plugin to lint pandas in an opinionated way

pandas-vet

pandas-vet is a plugin for flake8 that provides opinionated linting for pandas code.

It began as a project during the PyCascades 2019 sprints.

Motivation

Starting with pandas can be daunting. The usual internet help sites are littered with different ways to do the same thing, and some features that the pandas docs themselves discourage live on in the API. pandas-vet is (hopefully) a way to make pandas a little more friendly for newcomers by taking some opinionated stances about pandas best practices. It is designed to help users reduce the pandas universe.

The idea to create a linter was sparked by Ania Kapuścińska's talk at PyCascades 2019, "Lint your code responsibly!".

Many of the opinions stem from Ted Petrou's excellent Minimally Sufficient Pandas. Other ideas are drawn from pandas docs or elsewhere. The Pandas in Black and White flashcards have a lot of the same opinions too.

Installation

pandas-vet is a plugin for flake8. If you don't have flake8 already, it will be installed automatically when you install pandas-vet.

The plugin is on PyPI and can be installed with:

pip install pandas-vet

It can also be installed with conda:

conda install -c conda-forge pandas-vet

pandas-vet is tested under Python 3.6, 3.7, 3.8, and 3.9, as defined in our GitHub Actions workflow.

Usage

Once installed successfully in an environment that also has flake8 installed, pandas-vet should run whenever flake8 is run.

$ flake8 ...

See the flake8 docs for more information.

For a full list of implemented warnings, see the list below.

Contributing

pandas-vet is still in the very early stages. Contributions are welcome from the community on code, tests, docs, and just about anything else.

Code of Conduct

Because this project started during the PyCascades 2019 sprints, we adopt the PyCascades minimal expectation that we "Be excellent to each other". Beyond that, we follow the Python Software Foundation's Community Code of Conduct.

Steps to contributing

  1. Please submit an issue (or draft PR) first describing the types of changes you'd like to implement.

  2. Fork the repo and create a new branch for your enhancement/fix.

  3. Get a development environment set up with your favorite environment manager (conda, virtualenv, etc.).

    1. You must use at least Python 3.6 to develop, for black support.

    2. You can create one with pip install -r requirements_dev.txt or, if you use Docker, you can build an image from the Dockerfile included in this repo.

    3. Once your environment is set up, you will need to install pandas-vet in development mode. Use pip install -e . (if you are already in your virtual environment) or pip install -e <path> (if you are not in the virtual environment and prefer to state explicitly where it is going).

  4. Write code, docs, etc.

  5. We use pytest, flake8, and black to validate our codebase. TravisCI integration will complain on pull requests if there are any failing tests or lint violations. To check these locally, run the following commands:

pytest --cov="pandas_vet"
flake8 pandas_vet setup.py tests --exclude tests/data
black --check pandas_vet setup.py tests --exclude tests/data

  6. Push to your forked repo.

  7. Submit a pull request to the parent repo from your branch. Be sure to write a clear message and reference the Issue # that relates to your pull request.

  8. Feel good about giving back to open source projects.

How to add a check to the linter

  1. Write tests. At a minimum, you should have test cases where the linter should catch "bad" pandas and test cases where the linter should allow "good" pandas.

  2. Write your check function in /pandas_vet/__init__.py.

  3. Run flake8 and pytest on the linter itself (see Steps to contributing).
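
As a rough illustration of steps 1 and 2, here is a minimal sketch of what a check and its "good" and "bad" test cases might look like. It uses only the standard-library ast module. The function name check_for_isnull, the return shape, and the sample snippets are assumptions made for this example; this is not necessarily how the checks in pandas-vet are written.

```python
import ast
from typing import List, Tuple

# Message text modeled on the PD003 entry in the list of warnings.
PD003 = "PD003 '.isna' is preferred to '.isnull'; functionality is equivalent"


def check_for_isnull(tree: ast.AST) -> List[Tuple[int, int, str]]:
    """Return (line, column, message) for every call of a method named 'isnull'.

    Hypothetical helper for illustration; the name and return shape are assumptions.
    """
    errors = []
    for node in ast.walk(tree):
        # A call such as `pd.isnull(x)` parses as Call(func=Attribute(attr="isnull")).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "isnull"
        ):
            errors.append((node.lineno, node.col_offset, PD003))
    return errors


# "Bad" pandas: the check should report it. The source is only parsed, never run.
bad = "import pandas as pd\nnulls = pd.isnull(values)\n"
# "Good" pandas: the check should stay silent.
good = "import pandas as pd\nnulls = pd.isna(values)\n"

bad_errors = check_for_isnull(ast.parse(bad))
good_errors = check_for_isnull(ast.parse(good))
```

Because the snippets are parsed rather than executed, test cases like these do not need pandas data to exist; they only need to be valid Python.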

Contributors

PyCascades 2019 sprints team

PyCascades 2020 sprints team

Other awesome contributors

  • Earl Clark
  • Leandro Leites
  • pwoolvett
  • sigmavirus24

List of warnings

PD001: pandas should always be imported as 'import pandas as pd'

PD002: 'inplace = True' should be avoided; it has inconsistent behavior

PD003: '.isna' is preferred to '.isnull'; functionality is equivalent

PD004: '.notna' is preferred to '.notnull'; functionality is equivalent

PD005: Use arithmetic operator instead of method

PD006: Use comparison operator instead of method

PD007: '.ix' is deprecated; use more explicit '.loc' or '.iloc'

PD008: Use '.loc' instead of '.at'. If speed is important, use numpy.

PD009: Use '.iloc' instead of '.iat'. If speed is important, use numpy.

PD010: '.pivot_table' is preferred to '.pivot' or '.unstack'; provides same functionality

PD011: Use '.array' or '.to_numpy()' instead of '.values'; 'values' is ambiguous

PD012: '.read_csv' is preferred to '.read_table'; provides same functionality

PD013: '.melt' is preferred to '.stack'; provides same functionality

PD015: Use '.merge' method instead of 'pd.merge' function. They have equivalent functionality.
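
To make a few of these warnings concrete, the sketch below maps some discouraged attribute names to the preferred spelling and scans a sample snippet for them. The PREFERRED table and the suggest function are illustrative assumptions built from the list above; they cover only a handful of codes and are not part of pandas-vet.

```python
import ast

# Discouraged attribute -> (warning code, preferred spelling), taken from the list above.
# The table itself is an assumption for illustration, not pandas-vet's own data.
PREFERRED = {
    "isnull": ("PD003", "isna"),
    "notnull": ("PD004", "notna"),
    "at": ("PD008", "loc"),
    "iat": ("PD009", "iloc"),
    "read_table": ("PD012", "read_csv"),
}


def suggest(source: str) -> list:
    """List (line, code, hint) for each discouraged attribute found in source."""
    hints = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and node.attr in PREFERRED:
            code, better = PREFERRED[node.attr]
            hints.append((node.lineno, code, f"use .{better} instead of .{node.attr}"))
    return sorted(hints)


# Sample code to scan; it is parsed only, so pandas does not need to be installed.
example = (
    "import pandas as pd\n"
    "sales = pd.read_table('sales.txt')\n"
    "missing = sales.isnull()\n"
    "first = sales.at[0, 'price']\n"
)
hints = suggest(example)
```

Matching on attribute names alone will also flag unrelated objects that happen to have an attribute called at or isnull, which is one reason a linter of this kind is best treated as opinionated advice.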

Very Opinionated Warnings

These warnings are turned off by default. To enable them, add the --annoy flag to your command, e.g.,

$ flake8 --annoy my_file.py

PD901: 'df' is a bad variable name. Be kinder to your future self.
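
The opt-in behavior can be pictured with a small sketch. Here argparse stands in for flake8's own option handling, and the visible helper and the "PD9" prefix rule are assumptions made for illustration; only the --annoy flag name and the PD901 code come from the text above.

```python
import argparse

# Stand-in for flake8's option handling (an assumption for this sketch).
parser = argparse.ArgumentParser()
parser.add_argument("--annoy", action="store_true", default=False)


def visible(codes, annoy):
    """Keep codes starting with 'PD9' only when the opt-in flag was given."""
    return [code for code in codes if annoy or not code.startswith("PD9")]


found = ["PD002", "PD901"]
default_run = visible(found, parser.parse_args([]).annoy)
annoyed_run = visible(found, parser.parse_args(["--annoy"]).annoy)
```

With these inputs, PD901 is dropped on a default run and kept when --annoy is passed.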
