🐼 Patrol your data tests

Panda Patrol

Add dashboards, alerting, and silencing to your data tests with fewer than 5 lines of code.

Questions and feedback

Email: ivanzhangofficial@gmail.com

Call: https://calendly.com/aivanzhang/chat

See Airflow for how to add panda patrols to your Airflow-based data pipelines.

See Dagster for how to add panda patrols to your Dagster-based data pipelines.

Overview

Wrap your existing data tests to automatically generate dashboards, alerting, and silencing. This library does not currently orchestrate the data tests themselves; orchestration may be added in the future depending on demand.

Getting Started (Demo)

This short tutorial creates a patrol around a data test and displays it on a publicly accessible dashboard: https://panda-patrol.vercel.app/dashboard. The tutorial uses Dagster to run the data tests, but you can use any Python-based data pipeline.

1) Installation

Install the latest version of panda-patrol using pip:

pip install panda-patrol
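
To confirm the install from Python, a minimal sketch using only the standard library might look like the following. The helper name installed_version is hypothetical and not part of panda-patrol; it only asks the packaging metadata which version of a distribution is present.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "panda-patrol" is the distribution name used in the pip command above.
    found = installed_version("panda-patrol")
    print(found if found else "panda-patrol is not installed in this environment")
```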

2) Set up the environment variables

In an existing or new .env file, set the following environment variables:

PANDA_PATROL_URL=https://panda-patrol.vercel.app/dashboard
PANDA_PATROL_ENV=production

See .env.example for how to set these and other environment variables, and Environment Variables for details on each one.
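
Before running a pipeline, it can help to check that both variables are visible to the process. The sketch below is an illustration only: the helper missing_env_vars is hypothetical, and it assumes the values from your .env file end up in the process environment (for example, exported by your shell or loaded by your pipeline). How panda-patrol itself reads them is not covered here.

```python
import os

# The two variables set in this tutorial; others are described under Environment Variables.
REQUIRED_VARS = ("PANDA_PATROL_URL", "PANDA_PATROL_ENV")


def missing_env_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All tutorial environment variables are set.")
```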

3) Wrap your existing data tests

Spin up a new data test dashboard by wrapping your existing data tests with patrol_group and @patrol. The following example wraps a data test in a Dagster pipeline, but the same approach works in any Python-based data pipeline.

At a high level, you do the following:

Import patrol_group
Group several data tests with patrol_group
Wrap each individual existing data test with @patrol

from panda_patrol.patrols import patrol_group
...
with patrol_group(PATROL_GROUP_NAME) as patrol:
    @patrol(PATROL_NAME)
    def DATA_TEST_NAME(patrol_id):
        ...
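
To make the shape of this pattern concrete, here is a toy stand-in that can be run without panda-patrol. It is not the library's implementation. The names toy_patrol_group and RESULTS are hypothetical, and the sketch assumes the decorator runs the test as soon as it is defined, which is a guess based on the fact that the template never calls the decorated function explicitly.

```python
from contextlib import contextmanager

# Hypothetical in-memory store standing in for a dashboard backend.
RESULTS = []


@contextmanager
def toy_patrol_group(group_name):
    """Yield a decorator factory that runs and records each wrapped test."""

    def patrol(patrol_name):
        def decorator(func):
            patrol_id = f"{group_name}/{patrol_name}"
            record = {"group": group_name, "patrol": patrol_name}
            try:
                # Assumption: the test runs immediately at definition time.
                record["detail"] = func(patrol_id)
                record["status"] = "success"
            except AssertionError as exc:
                record["detail"] = str(exc)
                record["status"] = "failure"
            RESULTS.append(record)
            return func

        return decorator

    yield patrol


with toy_patrol_group("Demo group") as patrol:

    @patrol("Row count is positive")
    def row_count_is_positive(patrol_id):
        rows = [1, 2, 3]
        assert len(rows) > 0
        return len(rows)

    @patrol("Values are non-negative")
    def values_are_non_negative(patrol_id):
        values = [1, -2, 3]
        assert all(v >= 0 for v in values), "found a negative value"


for record in RESULTS:
    print(record["group"], "|", record["patrol"], "|", record["status"])
```

In this toy version a failing assertion is recorded rather than raised, so later tests in the same group still run. Whether panda-patrol behaves the same way is not stated in this tutorial.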

Here is a more detailed example of how to wrap a data test in a dagster pipeline. Before (hello-dagster.py from https://docs.dagster.io/getting-started/hello-dagster):

def hackernews_top_stories(context: AssetExecutionContext):
    """Get items based on story ids from the HackerNews items endpoint."""
    with open("hackernews_top_story_ids.json", "r") as f:
        hackernews_top_story_ids = json.load(f)

    results = []
    # Get information about each item including the url
    for item_id in hackernews_top_story_ids:
        item = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
        ).json()
        results.append(item)

    # DATA TEST: Make sure that the item's URL is a valid URL
    for item in results:
        print(item["url"])
        get_item_response = requests.get(item["url"])
        assert get_item_response.status_code == 200
    ...

After:

from panda_patrol.patrols import patrol_group  # new import
...
def hackernews_top_stories(context: AssetExecutionContext):
    """Get items based on story ids from the HackerNews items endpoint."""
    with open("hackernews_top_story_ids.json", "r") as f:
        hackernews_top_story_ids = json.load(f)

    results = []
    # Get information about each item including the url
    for item_id in hackernews_top_story_ids:
        item = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
        ).json()
        results.append(item)

    # DATA TEST: Make sure that the item's URL is a valid URL
    # New: group the test with patrol_group and wrap it with @patrol
    with patrol_group("Hackernews Items are Valid") as patrol:

        @patrol("URLs work")
        def urls_work(patrol_id):
            """URLs for stories should work."""
            for item in results:
                print(item["url"])
                get_item_response = requests.get(item["url"])
                assert get_item_response.status_code == 200

            return len(results)
    ...
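
The data test above stops at the first failing URL and raises a KeyError for any item that has no "url" key, which can happen for text-only Hacker News posts. As a hedged variant, the sketch below collects every problem instead. The helper find_broken_urls and its fetch_status argument are hypothetical names; the fetcher is injected so the example runs offline.

```python
def find_broken_urls(items, fetch_status):
    """Return (item id, reason) pairs for items whose URL check fails.

    fetch_status is any callable that maps a URL to an HTTP status code.
    """
    broken = []
    for item in items:
        url = item.get("url")
        if url is None:
            # Some items carry no "url" field at all.
            broken.append((item.get("id"), "missing url"))
            continue
        status = fetch_status(url)
        if status != 200:
            broken.append((item.get("id"), f"status {status}"))
    return broken


# Offline usage with a fake fetcher. In a pipeline you might pass
# lambda url: requests.get(url).status_code instead.
fake_statuses = {"https://example.com/ok": 200, "https://example.com/gone": 404}
sample_items = [
    {"id": 1, "url": "https://example.com/ok"},
    {"id": 2, "url": "https://example.com/gone"},
    {"id": 3},
]
print(find_broken_urls(sample_items, fake_statuses.get))
```

Inside a wrapped data test you could then assert that the returned list is empty, so a failure message can name every broken item at once.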

❗IMPORTANT
Note that each data test function (for example, urls_work) should take only one parameter, patrol_id. This parameter is used when defining adjustable parameters for your data tests; see Parameters.
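
If you want to catch a wrong signature before wiring a test into a pipeline, a small standard-library check is one option. The helper takes_only_patrol_id is a hypothetical name used for illustration and is not part of panda-patrol.

```python
import inspect


def takes_only_patrol_id(func) -> bool:
    """Return True if func declares exactly one parameter named patrol_id."""
    return list(inspect.signature(func).parameters) == ["patrol_id"]


def urls_work(patrol_id):
    """Example data test with the expected signature."""
    return 0


def bad_test(patrol_id, extra):
    """Example data test with one parameter too many."""
    return 0


print(takes_only_patrol_id(urls_work))  # True
print(takes_only_patrol_id(bad_test))   # False
```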

4) Run your data pipeline

Start your data pipeline as you normally would, then run the pipeline step that contains the test. Here we use Dagster to run the data tests, but any Python-based data pipeline works.

dagster dev -f hello-dagster.py

5) View the results

Go to https://panda-patrol.vercel.app/dashboard to view the results of your data tests. Because this dashboard is publicly accessible, you may also see other people's data tests on it.

🎉 Congrats! 🎉 You have created your first data test dashboard! See the documentation for more information, and see Quickstart for how to spin up your own Panda Patrol server and use other features such as adjustable parameters, alerting, and silencing.
