not-again-ai
Have you ever been working on a project and groaned as you go to search again on how to nicely plot a simple distribution? Or have you been frustrated at wanting to run multiple functions in parallel, but stuck between the seemingly ten different ways to do it? not-again-ai is a Python package designed to once and for all collect all these little things that come up over and over again in AI projects and put them in one place.
Documentation available at DaveCoDev.github.io/not-again-ai/.
Installation
Requires: Python 3.9, 3.10
Install from PyPI
$ pip install not_again_ai
Quick tour
Visualization
We currently offer two visualization tools: a time series plot and a histogram for plotting univariate distributions.
>>> import numpy as np
>>> import pandas as pd
>>> from not_again_ai.viz.time_series import ts_lineplot
>>> from not_again_ai.viz.distributions import univariate_distplot
# get some time series data
>>> rs = np.random.RandomState(365)
>>> values = rs.randn(365, 4).cumsum(axis=0)
>>> dates = pd.date_range('1 1 2021', periods=365, freq='D')
# plot the time series and save it to a file
>>> ts_lineplot(ts_data=values, save_pathname='myplot.png', ts_x=dates, ts_names=['A', 'B', 'C', 'D'])
# get a random distribution
>>> distrib = np.random.beta(a=0.5, b=0.5, size=1000)
# plot the distribution and save it to a file
>>> univariate_distplot(
... data=distrib,
... save_pathname='mydistribution.svg',
... print_summary=False, bins=100,
... title=r'Beta Distribution $\alpha=0.5, \beta=0.5$'
... )
Parallel
For when you have functions you want to execute in parallel.
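The functions passed in below (do_something, do_something2, and mult) are stand-ins for your own code. A minimal sketch of what they might look like (in a real script, define them at module level so they can be pickled for multiprocessing):
>>> def do_something():
...     return sum(i * i for i in range(1_000))
>>> def do_something2():
...     return max(i % 7 for i in range(1_000))
>>> def mult(x, y):
...     return x * y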
>>> import multiprocessing
>>> from not_again_ai.parallel import embarrassingly_parallel, embarrassingly_parallel_simple
# execute the passed in functions in parallel without any additional arguments
>>> result = embarrassingly_parallel_simple([do_something, do_something2], num_processes=2)
# execute the function mult in parallel with the passed in arguments
>>> args = ((2, 2), (3, 3), (4, 4))
>>> result2 = embarrassingly_parallel(mult, args, num_processes=multiprocessing.cpu_count())
Filesystem
We provide helpers to deal with files and directories easily and without raising unnecessary errors.
>>> from not_again_ai.system.files import create_file_dir
# creates the directory mydir if it does not exist
>>> create_file_dir('mydir/myfile.txt')
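Since the parent directory now exists, the file can be written without a FileNotFoundError (a quick sketch using plain Python file I/O):
>>> with open('mydir/myfile.txt', 'w') as f:
...     _ = f.write('hello world')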
Data Analysis
We provide a few helpers for data analysis.
>>> from not_again_ai.data_analysis.dependence import pearson_correlation
# quadratic dependence (rs is the RandomState created in the visualization example above)
>>> x = (rs.rand(500) * 4) - 2
>>> y = x**2 + (rs.randn(500) * 0.2)
>>> pearson_correlation(x, y)
0.05
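Pearson correlation only captures linear relationships, which is why the quadratic example above scores near zero. For contrast, an illustrative linearly related pair:
# linear dependence yields a coefficient close to 1
>>> y_linear = (2 * x) + (rs.randn(500) * 0.2)
>>> pearson_correlation(x, y_linear)  # close to 1.0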
Development Information
This package uses Poetry to manage dependencies and isolated Python virtual environments.
To proceed, install Poetry globally onto your system.
(Optional) configure Poetry to use an in-project virtual environment.
$ poetry config virtualenvs.in-project true
Dependencies
Dependencies are defined in pyproject.toml and specific versions are locked into poetry.lock. This allows for exact reproducible environments across all machines that use the project, both during development and in production.
To install all dependencies into an isolated virtual environment:
$ poetry install
Append --sync to uninstall dependencies that are no longer in use from the virtual environment.
To activate the virtual environment that is automatically created by Poetry:
$ poetry shell
To deactivate the environment:
(.venv) $ exit
To upgrade all dependencies to their latest versions:
$ poetry update
Packaging
This project is designed as a Python package, meaning that it can be bundled up and redistributed as a single compressed file.
Packaging is configured in pyproject.toml.
To package the project as both a source distribution and a wheel:
$ poetry build
This will generate dist/not-again-ai-0.1.0.tar.gz and dist/not_again_ai-0.1.0-py3-none-any.whl.
Read more about the advantages of wheels to understand why generating wheel distributions is important.
Publish Distributions to PyPI
Source and wheel redistributable packages can be published to PyPI or installed directly from the filesystem using pip.
$ poetry publish
Enforcing Code Quality
Automated code quality checks are performed using Nox and nox-poetry. Nox will automatically create virtual environments and run commands based on noxfile.py for unit testing, PEP 8 style guide checking, type checking, and documentation generation.
Note: nox is installed into the virtual environment automatically by the poetry install command above. Run poetry shell to activate the virtual environment.
To run all default sessions:
(.venv) $ nox
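For reference, a session in noxfile.py typically looks something like the following (an illustrative sketch using nox-poetry, not this project's actual noxfile):
from nox_poetry import Session, session

@session(python=["3.9", "3.10"])
def test(s: Session) -> None:
    # install this package and pytest, constrained to the versions pinned in poetry.lock
    s.install(".", "pytest")
    s.run("pytest")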
Unit Testing
Unit testing is performed with pytest. pytest has become the de facto Python unit testing framework. Some key advantages over the built-in unittest module are:
- Significantly less boilerplate needed for tests.
- PEP 8 compliant names (e.g. pytest.raises() instead of self.assertRaises()).
- Vibrant ecosystem of plugins.
pytest will automatically discover and run tests by recursively searching for folders and .py files prefixed with test, and within them any functions prefixed with test.
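For example, a hypothetical tests/test_example.py like the following would be discovered and run automatically:
import pytest

def add(x: int, y: int) -> int:
    return x + y

# discovered because the file and function names are prefixed with test
def test_add() -> None:
    assert add(2, 3) == 5

# pytest.raises replaces unittest's self.assertRaises
def test_add_rejects_none() -> None:
    with pytest.raises(TypeError):
        add(2, None)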
The tests folder is created as a Python package (i.e. there is an __init__.py file within it) because this helps pytest uniquely namespace the test files. Without this, two test files cannot be named the same, even if they are in different subdirectories.
Code coverage is provided by the pytest-cov plugin.
When running a unit test Nox session (e.g. nox -s test), an HTML report is generated in the htmlcov folder showing each source file and which lines were executed during unit testing. Open htmlcov/index.html in a web browser to view the report. Code coverage reports help identify areas of the project that are currently not tested.
pytest and code coverage are configured in pyproject.toml.
To pass arguments to pytest through nox:
(.venv) $ nox -s test -- -k invalid_factorial
Code Style Checking
PEP 8 is the universally accepted style guide for Python code. PEP 8 code compliance is verified using Flake8. Flake8 is configured in the [tool.flake8] section of pyproject.toml. Extra Flake8 plugins are also included (short examples of the issues they catch follow the list):
- flake8-bugbear: Find likely bugs and design problems in your program.
- flake8-broken-line: Forbid using backslashes (\) for line breaks.
- flake8-comprehensions: Helps write better list/set/dict comprehensions.
- pep8-naming: Ensure functions, classes, and variables are named with correct casing.
- pyproject-flake8: Allow configuration of flake8 through pyproject.toml.
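To give a sense of what these plugins catch (hypothetical snippets, not taken from this project):
# flagged by flake8-bugbear (B006): mutable default argument
def append_item(item, items=[]):
    items.append(item)
    return items

# flagged by flake8-comprehensions (C400): unnecessary generator passed to list()
squares = list(x * x for x in range(10))  # prefer [x * x for x in range(10)]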
Some code style settings are included in .editorconfig and will be configured automatically in editors such as PyCharm.
To lint code, run:
(.venv) $ nox -s lint
Automated Code Formatting
Code is automatically formatted using black. Imports are automatically sorted and grouped using isort.
These tools are configured in pyproject.toml.
To automatically format code, run:
(.venv) $ nox -s fmt
To verify code has been formatted, such as in a CI job:
(.venv) $ nox -s fmt_check
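As a rough illustration of what the formatters change (hypothetical snippet):
# before: imports out of order, inconsistent spacing and quotes
import pandas as pd
import os
x = { 'a':1,'b':2 }

# after nox -s fmt: isort groups standard-library and third-party imports,
# black normalizes spacing and quotes
import os

import pandas as pd

x = {"a": 1, "b": 2}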
Type Checking
Type annotations allow developers to add optional static typing information to Python source code. This allows static analyzers such as mypy, PyCharm, or Pyright to check that functions are used with the correct types before runtime.
def factorial(n: int) -> int:
...
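For example, with the annotation above, mypy will reject a call that passes the wrong type (the function body here is just an illustration):
def factorial(n: int) -> int:
    return 1 if n <= 1 else n * factorial(n - 1)

factorial("5")  # mypy error: argument has incompatible type "str"; expected "int"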
mypy is configured in pyproject.toml. To type check code, run:
(.venv) $ nox -s type_check
See also awesome-python-typing.
Distributing Type Annotations
PEP 561 defines how a Python package should communicate the presence of inline type annotations to static type checkers. mypy's documentation provides further examples on how to do this.
Mypy looks for the existence of a file named py.typed in the root of the installed package to indicate that inline type annotations should be checked.
Continuous Integration
Continuous integration is provided by GitHub Actions. This runs all tests, lints, and type checking for every commit and pull request to the repository.
GitHub Actions is configured in .github/workflows/python.yml.
Visual Studio Code
Install the Python extension for VSCode.
Default settings are configured in .vscode/settings.json. This enables flake8 linting and black formatting with settings consistent with the rest of the project.
Documentation
Generating a User Guide
Material for MkDocs is a powerful static site generator that combines easy-to-write Markdown with a number of Markdown extensions that increase its power. This makes it a great fit for user guides and other technical documentation.
The example MkDocs project included in this project is configured to allow the built documentation to be hosted at any URL or viewed offline from the file system.
To build the user guide, run:
(.venv) $ nox -s docs
and open docs/user_guide/site/index.html using a web browser.
To build and serve the user guide with automatic rebuilding as you change the contents, run:
(.venv) $ nox -s docs_serve
and open http://127.0.0.1:8000 in a browser.
Each time the main Git branch is updated, the .github/workflows/pages.yml GitHub Action will automatically build the user guide and publish it to GitHub Pages. This is configured in the docs_github_pages Nox session.
Generating API Documentation
This project uses mkdocstrings plugin for MkDocs, which renders Google-style docstrings into an MkDocs project. Google-style docstrings provide a good mix of easy-to-read docstrings in code as well as nicely-rendered output.
"""Computes the factorial through a recursive algorithm.
Args:
n: A positive input value.
Raises:
InvalidFactorialError: If n is less than 0.
Returns:
Computed factorial.
"""
Misc
If you get a Failed to create the collection: Prompt dismissed.. error when running poetry update on Ubuntu, try setting the following environment variable:
```bash
export PYTHON_KEYRING_BACKEND=keyring.backends.null.Keyring
```
Attributions
python-blueprint for the Python package skeleton.