Type stubs for Python machine learning libraries
Mypy type stubs for numpy, pandas and matplotlib
This is a PEP 561-compliant stub-only package which provides type information for matplotlib, numpy and pandas. Once this package is installed, the mypy type checker (or pytype or PyCharm) can recognize the types in these packages.
NOTE: This is a work in progress
Many functions are already typed, but much is still missing (numpy and pandas are huge libraries). As a result, Mypy may report that a function does not exist when it actually does. If you encounter missing functions, we would be very happy for you to send a PR. If you are unsure of how to type a function, we can discuss it.
Installing
You can get this package from PyPI:
pip install data-science-types
To get the most up-to-date version, install it directly from GitHub:
pip install git+https://github.com/predictive-analytics-lab/data-science-types
Or clone the repository somewhere and run
pip install -e .
Examples
These are the kinds of things that can be checked:
Array creation
import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, 3])  # OK
arr2: np.ndarray[np.int32] = np.array([3, 7, 39, 3])  # Type error
arr3: np.ndarray[np.int32] = np.array([3, 7, 39, 3], dtype=np.int32)  # OK
arr4: np.ndarray[float] = np.array([3, 7, 39, 3], dtype=float)  # Type error: the type of ndarray can not be just "float"
arr5: np.ndarray[np.float64] = np.array([3, 7, 39, 3], dtype=float)  # OK
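To give a rough idea of how checks like these can be expressed, here is a toy model that uses only the standard library. The names `int32`, `int64`, `ndarray` and `array` below are simplified stand-ins written for this illustration, not the actual stub definitions, and the sketch does not model the special handling of `dtype=float`. It assumes that a generic array class combined with overloads on the `dtype` argument is enough to let a checker tell the cases apart.

```python
from typing import Any, Generic, List, Type, TypeVar, overload

DType = TypeVar("DType")


class int32(int):
    """Toy stand-in for np.int32 (illustration only)."""


class int64(int):
    """Toy stand-in for np.int64 (illustration only)."""


class ndarray(Generic[DType]):
    """Toy stand-in for np.ndarray, generic over its element type."""

    def __init__(self, data: List[int], dtype: type) -> None:
        self.data = data
        self.dtype = dtype


@overload
def array(data: List[int]) -> "ndarray[int64]": ...
@overload
def array(data: List[int], dtype: Type[DType]) -> "ndarray[DType]": ...
def array(data: List[int], dtype: type = int64) -> "ndarray[Any]":
    # Both overloads share this runtime behaviour; only the declared
    # return type seen by the type checker differs.
    return ndarray(data, dtype)


arr1: "ndarray[int64]" = array([3, 7, 39, 3])  # OK: first overload
arr3: "ndarray[int32]" = array([3, 7, 39, 3], dtype=int32)  # OK: second overload
# arr2: "ndarray[int32]" = array([3, 7, 39, 3])
# A checker would reject the line above: the first overload returns
# ndarray[int64], which is not assignable to ndarray[int32].
```

At runtime both calls succeed; the distinction exists only for the type checker, which is the point of a stub-only package.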
Operations
import numpy as np

arr1: np.ndarray[np.int64] = np.array([3, 7, 39, 3])
arr2: np.ndarray[np.int64] = np.array([4, 12, 9, 1])
result1: np.ndarray[np.int64] = np.divide(arr1, arr2)  # Type error
result2: np.ndarray[np.float64] = np.divide(arr1, arr2)  # OK
compare: np.ndarray[np.bool_] = (arr1 == arr2)
Reductions
import numpy as np

arr: np.ndarray[np.float64] = np.array([[1.3, 0.7], [43.0, 5.6]])
sum1: int = np.sum(arr)  # Type error
sum2: np.float64 = np.sum(arr)  # OK
sum3: float = np.sum(arr)  # Also OK: np.float64 is a subclass of float
sum4: np.ndarray[np.float64] = np.sum(arr, axis=0)  # OK
# the same works with np.max, np.min and np.prod
Philosophy
The goal is not to recreate the APIs exactly. The main goal is to have useful checks on our code. Often the actual APIs in the libraries are more permissive than the type signatures in our stubs; but this is (usually) a feature and not a bug.
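As a small illustration of a signature that is stricter than the runtime behaviour, consider the sketch below. The function `total` is a hypothetical example made up for this purpose and is not part of numpy, pandas, matplotlib or these stubs.

```python
from typing import List


def total(values: List[float]) -> float:
    """Annotated narrowly: the signature only promises to handle a list of floats."""
    result = 0.0
    for value in values:
        result += value
    return result


# Matches the annotation, so a type checker accepts it.
from_list = total([1.5, 2.5])

# Works at runtime too, because the loop accepts any iterable, but a type
# checker would flag it since a tuple is not a List[float]. The ignore
# comment silences that report here.
from_tuple = total((1.5, 2.5))  # type: ignore[arg-type]
```

The narrower annotation rejects some calls that would run, but in exchange it catches accidental misuse early, which is the trade-off described above.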
Contributing
We always welcome contributions. All pull requests are subject to CI checks. We check for compliance with Mypy and that the file formatting conforms to our Black specification.
You can install these dev dependencies via
pip install -e '.[dev]'
This will also install numpy, pandas and matplotlib so that the tests can be run.
Running CI locally (recommended)
We include a script that runs the same CI checks that run when a PR is opened. To run them locally, you need to install the type stubs in your environment. Typically, you would do this with
pip install -e .
Then use the check_all.sh script to run all tests:
./check_all.sh
Below we describe how to run the various checks individually, but check_all.sh should be easier to use.
Checking compliance with Mypy
The settings for Mypy are specified in the mypy.ini file in the repository. Just running
mypy tests
from the base directory should take these settings into account. We enforce 0 mypy errors.
Formatting with black
We use Black to format the stub files. First install black and then run
black .
from the base directory.
Pytest
python -m pytest -vv tests/
Flake8
flake8 *-stubs
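For readers who prefer to drive the individual checks from Python, the following is a minimal sketch that chains the four commands listed above. It is not the contents of check_all.sh, which are not reproduced here; the helper names `default_checks`, `run_command` and `run_checks` are hypothetical, and the sketch assumes the tools are installed and that it is run from the base directory.

```python
import glob
import subprocess
import sys
from typing import Callable, List, Sequence


def default_checks() -> List[List[str]]:
    """Build the commands described in this section, in the same order."""
    # Expand the "*-stubs" pattern ourselves, since no shell is involved.
    stub_dirs = sorted(glob.glob("*-stubs"))
    return [
        ["mypy", "tests"],
        ["black", "."],
        [sys.executable, "-m", "pytest", "-vv", "tests/"],
        ["flake8", *stub_dirs],
    ]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def run_checks(
    commands: Sequence[Sequence[str]],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[Sequence[str]]:
    """Run every command and return the ones that exited with a non-zero code."""
    return [command for command in commands if runner(command) != 0]
```

Calling `run_checks(default_checks())` returns an empty list when every check passes. The `runner` parameter exists so that the logic can be exercised without actually invoking the tools.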
License