Type stubs for Python machine learning libraries

Mypy type stubs for numpy, pandas and matplotlib

This is a PEP-561-compliant stub-only package which provides type information for matplotlib, numpy and pandas. Once this package is installed, the mypy type checker (or pytype or PyCharm) can recognize the types in these libraries.

NOTE: This is a work in progress

Many functions are already typed, but much is still missing (numpy and pandas are huge libraries). You will likely see Mypy report that a function does not exist when it actually does. If you encounter a missing function, we would be very happy to receive a PR. If you are unsure how to type a function, we can discuss it.
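
For readers unsure what typing a function involves, here is a minimal sketch of the general shape of a stub declaration. The names ndarray, _DType and array_sum are illustrative stand-ins and are not taken from this package's actual stub files; the sketch only assumes that an array class is generic over its element type and that a reduction returns either a scalar or an array depending on whether an axis is given.

```python
from typing import Generic, TypeVar, overload

# Hypothetical element-type parameter, e.g. an integer or float scalar type.
_DType = TypeVar("_DType")


class ndarray(Generic[_DType]):
    """Toy stand-in for an array class parameterised by its element type."""


# In a .pyi stub file, functions carry signatures only, with "..." as the body.
# Each overload describes one way of calling the function.
@overload
def array_sum(a: ndarray[_DType]) -> _DType: ...  # no axis: a scalar
@overload
def array_sum(a: ndarray[_DType], axis: int) -> ndarray[_DType]: ...  # axis: an array
```

With declarations of this shape, a type checker can tell that reducing without an axis gives a scalar while reducing along an axis gives another array of the same element type.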

Installing

You can get this package from PyPI:

pip install data-science-types

To get the most up-to-date version, install it directly from GitHub:

pip install git+https://github.com/predictive-analytics-lab/data-science-types

Or clone the repository somewhere and, from inside it, run:

pip install -e .
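
To confirm that the installation succeeded, one option is to ask Python for the installed version of the distribution. This is a small sketch that assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_version is made up for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


version = installed_version("data-science-types")
print("data-science-types:", version if version else "not installed")
```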

Examples

These are the kinds of things that can be checked:

Array creation

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, -3])  # Type error
arr3: np.ndarray[np.int32] = np.array([3, 7, 39, -3], dtype=np.int32)  # OK
arr4: np.ndarray[float] = np.array([3, 7, 39, -3], dtype=float)  # Type error: the type of ndarray cannot be just "float"
arr5: np.ndarray[np.float64] = np.array([3, 7, 39, -3], dtype=float)  # OK

Operations

import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, -3])
arr2: np.ndarray[np.int64] = np.array([4, 12, 9, -1])

result1: np.ndarray[np.int64] = np.divide(arr1, arr2)  # Type error
result2: np.ndarray[np.float64] = np.divide(arr1, arr2)  # OK

compare: np.ndarray[np.bool_] = (arr1 == arr2)

Reductions

import numpy as np

arr: np.ndarray[np.float64] = np.array([[1.3, 0.7], [-43.0, 5.6]])

sum1: int = np.sum(arr)  # Type error
sum2: np.float64 = np.sum(arr)  # OK
sum3: float = np.sum(arr)  # Also OK: np.float64 is a subclass of float
sum4: np.ndarray[np.float64] = np.sum(arr, axis=0)  # OK

# the same works with np.max, np.min and np.prod
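
The annotations in these examples mirror what numpy does at runtime. The following sketch checks a few of those runtime facts directly; it assumes numpy is installed and does not involve the stubs themselves. The variable names are chosen for this illustration only.

```python
import numpy as np

ints_a = np.array([3, 7, 39, -3])
ints_b = np.array([4, 12, 9, -1])

# True division of integer arrays produces floating-point results,
# which is why annotating the result as an integer array is flagged.
assert np.divide(ints_a, ints_b).dtype == np.float64

# Element-wise comparison produces booleans.
assert (ints_a == ints_b).dtype == np.bool_

floats = np.array([[1.3, 0.7], [-43.0, 5.6]])

# A full reduction returns a np.float64 scalar, which is also a Python float.
total = np.sum(floats)
assert isinstance(total, np.float64)
assert isinstance(total, float)

# Reducing along one axis returns an array instead of a scalar.
assert np.sum(floats, axis=0).shape == (2,)
```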

Philosophy

The goal is not to recreate the APIs exactly. The main goal is to have useful checks on our code. Often the actual APIs in the libraries are more permissive than the type signatures in our stubs; but this is (usually) a feature and not a bug.
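
As a rough illustration of a signature that is deliberately narrower than the runtime behaviour, consider the toy function below. It is a made-up example, not part of numpy, pandas, matplotlib or these stubs: the function works on any sequence of numbers at runtime, but its annotation only declares lists of floats as valid.

```python
from typing import List


def mean(values: List[float]) -> float:
    """Annotated narrowly on purpose: only lists of floats are declared valid."""
    return sum(values) / len(values)


# Accepted by the type checker and works at runtime.
ok = mean([1.0, 2.0, 3.0])

# Also works at runtime, since a tuple can be summed and measured, but a type
# checker rejects the call because the declared parameter type is List[float].
# The ignore comment silences that report for the sake of this demonstration.
also_runs = mean((1.0, 2.0, 3.0))  # type: ignore[arg-type]
```

The stricter annotation costs a little flexibility, but it catches calls that were probably unintended.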

Contributing

We always welcome contributions. All pull requests are subject to CI checks. We check for compliance with Mypy and that the file formatting conforms to our Black specification.

You can install these dev dependencies via

pip install -e ".[dev]"

This will also install numpy, pandas and matplotlib to be able to run the tests.

Pre-pull request checks

We also include a script, check_all.sh, that runs the same checks that CI runs when a PR is opened. To try them out locally, run it from the base directory:

./check_all.sh

Checking compliance with Mypy

The settings for Mypy are specified in the mypy.ini file in the repository. Just running

mypy tests

from the base directory should take these settings into account. We enforce 0 mypy errors.

Formatting with black

We use Black to format the stub files. First install black and then run

black -l 100 -t py36 -S .

from the base directory.

License

GPL 3
