
Awesome `np-validator` is a Python cli/package created with https://github.com/TezRomacH/python-package-template

Project description

np-validator


np-validator is just a simple Python cli/package that validates data sources for the neuropixel pipeline using templated workflows.

Quick overview

API

Auto-generated steps

By default, run_validation can be run on a list of filepaths. A list of ValidationStep objects will be auto-generated if a valid project name is supplied. If no project name is supplied, the project name will be "default".

from np_validator import run_validation

filepaths = [
  "some/sort/of/path/prefix/uuid-maybe.mapping.pkl",
  "some/sort/of/path/prefix/uuid-maybe.behavior.pkl",
  "this/will/be/ignored/uuid-maybe.replay.pkl",
]

results = run_validation(filepaths, project="default")

Manually generating validation steps

from np_validator import Processor, Validator, ValidationStep, run_validation

# make a basic filesize validator
fs_validator = Validator(
  name="meets_filesize_threshold",
  args={
    "threshold": 10,
  },
)

# add validator to a validation step
validation_step_0 = ValidationStep(
  path_suffix=".mapping.pkl",
  validators=[fs_validator, ],
)

# make a validation step with a processor
# processors convert the data source from its basic state, a filepath, into an easier-to-use object

# unpickle deserializes an arbitrary filepath into a Python object
unpickler = Processor(
  name="unpickle",
)

# has_dict_key checks whether a dict-like interface has a key at the given path
session_uuid_validator = Validator(
  name="has_dict_key",
  args={
    "path": ["session_uuid", ],
  }
)

# assembling it all together as a validation step

validation_step_1 = ValidationStep(
  path_suffix=".behavior.pkl",
  processor=unpickler,
  validators=[session_uuid_validator, ],
)

# running a validation
filepaths = [
  "some/sort/of/path/prefix/uuid-maybe.mapping.pkl",
  "some/sort/of/path/prefix/uuid-maybe.behavior.pkl",
  "this/will/be/ignored/uuid-maybe.replay.pkl",
]
results = run_validation(
  filepaths,
  [validation_step_0, validation_step_1, ],
)
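The path_suffix dispatch implied by the example above (the .replay.pkl file matches no step and is ignored) can be sketched in plain Python. match_steps is a hypothetical helper written for illustration, not part of the np_validator API:

```python
def match_steps(filepaths, suffixes):
    """Hypothetical sketch of suffix dispatch: map each filepath to the
    first configured suffix it ends with; unmatched paths are skipped."""
    matched = {}
    for path in filepaths:
        for suffix in suffixes:
            if path.endswith(suffix):
                matched[path] = suffix
                break
    return matched

filepaths = [
    "some/sort/of/path/prefix/uuid-maybe.mapping.pkl",
    "some/sort/of/path/prefix/uuid-maybe.behavior.pkl",
    "this/will/be/ignored/uuid-maybe.replay.pkl",
]
matched = match_steps(filepaths, [".mapping.pkl", ".behavior.pkl"])
# the .replay.pkl path matches neither suffix, so it is absent from `matched`
```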

Updating autogenerated steps

from np_validator import Processor, Validator, ValidationStep, autogenerate_validation_steps, update_project_validation_steps

# autogenerated steps are organized by projects
validation_step_0 = ValidationStep(
  path_suffix=".mapping.pkl",
  validators=[
    Validator(
      name="meets_filesize_threshold",
      args={
        "threshold": 10,
      },
    ),
  ],
)

validation_step_1 = ValidationStep(
  path_suffix=".behavior.pkl",
  processor=Processor(
    name="unpickle",
  ),
  validators=[
    Validator(
      name="has_dict_key",
      args={
        "path": ["session_uuid", ],
      }
    ),
  ],
)

# updating project validation steps overwrites the previous steps

update_project_validation_steps(
  "pretest",  # project name
  [validation_step_0, validation_step_1],
)

# to append, get the current steps with autogenerate_validation_steps and append to it
current = autogenerate_validation_steps("default")
updated = current + [validation_step_0, validation_step_1]
update_project_validation_steps(
  "default",  # project name
  updated, # append new steps
)

Documentation

For more detailed documentation on using this package please refer to the docs.

Contributing

Processor

To add a new processor, add the function to np_validator/processors.py. Ideally create a test for it in tests. Run the tests* to help ensure that no regressions have been introduced. Each processor is expected to have one required argument which is expected to be a string filepath.
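As a rough illustration of that contract, a processor like the unpickle one used earlier could take the following shape. This is a plain-Python sketch of the expected interface, not the package's actual implementation:

```python
import pickle

def unpickle(filepath: str):
    """Sketch of a processor: one required string filepath argument,
    returning the deserialized object for downstream validators."""
    with open(filepath, "rb") as f:
        return pickle.load(f)
```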

Validator

To add a new validator, add the function to np_validator/validators.py. Ideally create a test for it in tests. Run the tests* to help ensure that no regressions have been introduced.
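For example, the has_dict_key validator used earlier might be sketched like this; the exact signature np_validator expects of validator functions is an assumption here:

```python
def has_dict_key(obj, path):
    """Sketch of a validator: walk a list of keys into nested dict-like
    objects, returning True only if every key along the path exists."""
    current = obj
    for key in path:
        try:
            current = current[key]
        except (KeyError, TypeError):
            return False
    return True
```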

*Tests automatically run on pull_request.

Installation

pip install -U np-validator

or install with Poetry

poetry add np-validator

Makefile usage

Makefile contains a lot of functions for faster development.

1. Download and remove Poetry

To download and install Poetry run:

make poetry-download

To uninstall

make poetry-remove

2. Install all dependencies and pre-commit hooks

Install requirements:

make install

Pre-commit hooks can be installed after git init via

make pre-commit-install

3. Codestyle

Automatic formatting uses pyupgrade, isort and black.

make codestyle

# or use synonym
make formatting

Codestyle checks only, without rewriting files:

make check-codestyle

Note: check-codestyle uses the isort, black, and darglint libraries

Update all dev libraries to the latest version using one command:

make update-dev-deps

4. Code security

make check-safety

This command launches Poetry integrity checks as well as identifies security issues with Safety and Bandit.

5. Type checks

Run mypy static type checker

make mypy

6. Tests with coverage badges

Run pytest

make test

7. All linters

Of course there is a command to run all linters in one:

make lint

the same as:

make test && make check-codestyle && make mypy && make check-safety

8. Docker

make docker-build

which is equivalent to:

make docker-build VERSION=latest

Remove docker image with

make docker-remove

More information about docker.

9. Cleanup

Delete __pycache__ files

make pycache-remove

Remove package build

make build-remove

Delete .DS_STORE files

make dsstore-remove

Remove the .mypy_cache directory

make mypycache-remove

Or to remove all of the above, run:

make cleanup

📈 Releases

You can see the list of available releases on the GitHub Releases page.

We follow Semantic Versions specification.

We use Release Drafter. As pull requests are merged, a draft release is kept up-to-date listing the changes, ready to publish when you’re ready. With the categories option, you can categorize pull requests in release notes using labels.

List of labels and corresponding titles

| Label | Title in Releases |
| --- | --- |
| enhancement, feature | 🚀 Features |
| bug, refactoring, bugfix, fix | 🔧 Fixes & Refactoring |
| build, ci, testing | 📦 Build System & CI/CD |
| breaking | 💥 Breaking Changes |
| documentation | 📝 Documentation |
| dependencies | ⬆️ Dependencies updates |

You can update it in release-drafter.yml.
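For reference, the categories option in release-drafter.yml maps labels to the section headings above. A minimal sketch, assuming the stock Release Drafter configuration schema:

```yaml
# release-drafter.yml (excerpt) — labels grouped into release-note sections
categories:
  - title: ":rocket: Features"
    labels: [enhancement, feature]
  - title: ":wrench: Fixes & Refactoring"
    labels: [bug, refactoring, bugfix, fix]
  - title: ":package: Build System & CI/CD"
    labels: [build, ci, testing]
```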

GitHub creates the bug, enhancement, and documentation labels for you. Dependabot creates the dependencies label. Create the remaining labels on the Issues tab of your GitHub repository, when you need them.

🛡 License


This project is licensed under the terms of the MIT license. See LICENSE for more details.

Additional information

Initializing a new repo

Initialize your code

  1. Initialize git inside your repo:
cd np-validator && git init
  2. If you don't have Poetry installed, run:
make poetry-download
  3. Initialize poetry and install pre-commit hooks:
make install
make pre-commit-install
  4. Run the codestyle:
make codestyle
  5. Upload initial code to GitHub:
git add .
git commit -m ":tada: Initial commit"
git branch -M main
git remote add origin https://github.com/np_validator/np-validator.git
git push -u origin main

Set up bots

  • Set up Dependabot to ensure you have the latest dependencies.
  • Set up Stale bot for automatic issue closing.

Poetry

Want to know more about Poetry? Check its documentation.

Details about Poetry

Poetry's commands are very intuitive and easy to learn, like:

  • poetry add numpy@latest
  • poetry run pytest
  • poetry publish --build

etc.

Building and releasing your package

Building a new version of the application consists of the following steps:

  • Bump the version of your package with poetry version <version>. You can pass the new version explicitly, or a rule such as major, minor, or patch. For more details, refer to the Semantic Versions standard.
  • Make a commit to GitHub.
  • Create a GitHub release.
  • And... publish 🙂 poetry publish --build

🎯 What's next

Well, that's up to you 💪🏻. I can only recommend the packages and articles that helped me.

  • Typer is great for creating CLI applications.
  • Rich makes it easy to add beautiful formatting in the terminal.
  • Pydantic – data validation and settings management using Python type hinting.
  • Loguru makes logging (stupidly) simple.
  • tqdm – fast, extensible progress bar for Python and CLI.
  • IceCream is a little library for sweet and creamy debugging.
  • orjson – ultra fast JSON parsing library.
  • Returns makes your function's output meaningful, typed, and safe!
  • Hydra is a framework for elegantly configuring complex applications.
  • FastAPI is a type-driven asynchronous web framework.




Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

np-validator-0.10.0.tar.gz (25.8 kB)

Uploaded Source

Built Distribution

np_validator-0.10.0-py3-none-any.whl (21.1 kB)

Uploaded Python 3

File details

Details for the file np-validator-0.10.0.tar.gz.

File metadata

  • Download URL: np-validator-0.10.0.tar.gz
  • Upload date:
  • Size: 25.8 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.9.13 Darwin/20.1.0

File hashes

Hashes for np-validator-0.10.0.tar.gz

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | d0e8d70ed6228816e0b0895a3551c6dca90ed89f4a0dbdb108a4c9009df6b797 |
| MD5 | a01452df67dc741235faecbbfe945005 |
| BLAKE2b-256 | 71f5734ebf548d8ca1e6ac93e14a293ae9cf4baf0d2aff40d083f8c202d21990 |

See more details on using hashes here.

File details

Details for the file np_validator-0.10.0-py3-none-any.whl.

File metadata

  • Download URL: np_validator-0.10.0-py3-none-any.whl
  • Upload date:
  • Size: 21.1 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: poetry/1.1.14 CPython/3.9.13 Darwin/20.1.0

File hashes

Hashes for np_validator-0.10.0-py3-none-any.whl

| Algorithm | Hash digest |
| --- | --- |
| SHA256 | b8525fb6a481ce103f06b9b1d1396677f4ddf693a1597f6dc81801e056f2d21e |
| MD5 | de3839cb6eda35230317ff6503c30988 |
| BLAKE2b-256 | bcd017578ebe81deac567baa32769425701c4d43746a6bb9a6ee4303647e7a6a |

See more details on using hashes here.
