🐼 Patrol your data tests

Add dashboards, alerting, and silencing to your data tests with fewer than 10 lines of code.

Questions and feedback

Email: ivanzhangofficial@gmail.com

Call: https://calendly.com/aivanzhang/chat

Overview

Wrap your existing data tests to automatically generate dashboards, alerting, and silencing. The library does not currently orchestrate these data tests; you keep running them with your existing pipeline. Orchestration may be added in the future, depending on demand.

Quickstart

1) Installation

Install the latest version of panda-patrol using pip:

pip install panda-patrol

2) Set up the environment variables

In an existing or new .env file, set the following environment variables:

PANDA_PATROL_URL
PANDA_PATROL_ENV
PANDA_DATABASE_URL
SMTP_SERVER
SMTP_PORT
SMTP_USER
SMTP_PASS
PATROL_EMAIL

See .env.example for more information about how to set these environment variables. See Environment Variables for more information about each environment variable.
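
Before starting the server, it can help to confirm that every variable is present. The following is a minimal sketch, not part of panda-patrol: the helper name missing_env_vars is hypothetical, and it assumes all eight variables listed above are required and have already been loaded into the process environment (for example by your shell or a .env loader).

```python
import os

# Variable names copied from the list above; the helper itself is hypothetical.
REQUIRED_VARS = [
    "PANDA_PATROL_URL",
    "PANDA_PATROL_ENV",
    "PANDA_DATABASE_URL",
    "SMTP_SERVER",
    "SMTP_PORT",
    "SMTP_USER",
    "SMTP_PASS",
    "PATROL_EMAIL",
]


def missing_env_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All panda-patrol environment variables are set.")
```

Passing a plain dictionary instead of relying on os.environ makes the check easy to try out without touching your real configuration.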

3) Start the panda-patrol server

This spins up a website at PANDA_PATROL_URL:

python -m panda_patrol

4) Wrap your existing data tests

Spin up a new data test dashboard by wrapping your existing data tests with patrol_group and @patrol. The following example wraps a data test in a dagster pipeline, but the same approach works in any Python-based data pipeline.

Before (hello-dagster.py from https://docs.dagster.io/getting-started/hello-dagster):

def hackernews_top_stories(context: AssetExecutionContext):
    """Get items based on story ids from the HackerNews items endpoint."""
    with open("hackernews_top_story_ids.json", "r") as f:
        hackernews_top_story_ids = json.load(f)

    results = []
    # Get information about each item including the url
    for item_id in hackernews_top_story_ids:
        item = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
        ).json()
        results.append(item)

    # DATA TEST: Make sure that the item's URL is a valid URL
    for item in results:
        print(item["url"])
        get_item_response = requests.get(item["url"])
        assert get_item_response.status_code == 200
    ...

After:

+ from panda_patrol.patrols import patrol_group
...
def hackernews_top_stories(context: AssetExecutionContext):
    """Get items based on story ids from the HackerNews items endpoint."""
    with open("hackernews_top_story_ids.json", "r") as f:
        hackernews_top_story_ids = json.load(f)

    results = []
    # Get information about each item including the url
    for item_id in hackernews_top_story_ids:
        item = requests.get(
            f"https://hacker-news.firebaseio.com/v0/item/{item_id}.json"
        ).json()
        results.append(item)

    # DATA TEST: Make sure that the item's URL is a valid URL
+   with patrol_group("Hackernews Items are Valid") as patrol:
+       @patrol("URLs work")
+       def urls_work(patrol_id):
            """URLs for stories should work."""
            for item in results:
                print(item["url"])
                get_item_response = requests.get(item["url"])
                assert get_item_response.status_code == 200

            return len(results)
    ...
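
Note that urls_work is never called explicitly in the example above, which suggests the decorator runs the test itself. To make that pattern concrete, here is a minimal stand-in written for illustration only. It is not panda-patrol's implementation: fake_patrol_group, RESULTS, the way patrol_id is generated, and the choice to record failures instead of re-raising them are all assumptions. It only mimics the "with group as patrol, then @patrol(name)" shape in a plain Python script with no dagster involved.

```python
from contextlib import contextmanager

# Hypothetical stand-in, NOT panda-patrol's implementation: it only mimics
# the "with group as patrol / @patrol(name)" shape used above.
RESULTS = []  # collected (group, patrol, status, detail) records


@contextmanager
def fake_patrol_group(group_name):
    def patrol(patrol_name):
        def decorator(func):
            # Run the wrapped test immediately, as soon as it is decorated.
            try:
                value = func(len(RESULTS))  # hypothetical patrol_id
                RESULTS.append((group_name, patrol_name, "success", value))
            except AssertionError as exc:
                RESULTS.append((group_name, patrol_name, "failure", str(exc)))
            return func

        return decorator

    yield patrol


rows = [{"url": "https://example.com"}, {"url": ""}]

with fake_patrol_group("Rows are valid") as patrol:

    @patrol("Rows exist")
    def rows_exist(patrol_id):
        assert len(rows) > 0
        return len(rows)

    @patrol("URLs are non-empty")
    def urls_non_empty(patrol_id):
        for row in rows:
            assert row["url"], "empty url"
        return len(rows)
```

After this runs, RESULTS holds one record per decorated test. The real library reports to the dashboard at PANDA_PATROL_URL rather than to an in-memory list, and its behavior on a failed assertion may differ from this sketch.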

5) Run your data pipeline

Run your data pipeline as you normally would. Here we use dagster to run the data tests, but any Python-based data pipeline works.

dagster dev -f hello-dagster.py

6) View the results

Go to PANDA_PATROL_URL to view the results of your data tests. You should see something like this:

[Screenshots: main page of the Panda Patrol dashboard; results page of a specific data test run; log]

🎉 Congrats! 🎉 You have created your first data test dashboard! See the documentation for more information on other features like adjustable parameters, alerting, and silencing.
