Real-time data quality screening API — PASS / WARN / BLOCK in under 10ms

DataScreenIQ Python SDK

Most data pipelines don’t fail loudly. They silently corrupt production data and break dashboards, and the damage goes unnoticed for days.

DataScreenIQ acts as a gate before your database, detecting schema drift, missing values, and type mismatches in real time.

Real-time data quality screening at the edge. Screen any data payload and get PASS / WARN / BLOCK in milliseconds.

import datascreeniq as dsiq

client = dsiq.Client("dsiq_live_...")
report = client.screen(rows, source="orders")

print(report.status)       # BLOCK
print(report.health_pct)   # 34.0%
print(report.issues)       # {"type_mismatches": ["amount"], "null_rates": {"email": 0.5}}

Installation

pip install datascreeniq

With pandas support:

pip install datascreeniq[pandas]

With Excel support:

pip install datascreeniq[excel]

Everything:

pip install datascreeniq[all]

Quick start

Get a free API key at datascreeniq.com — 500K rows/month free.

import datascreeniq as dsiq

client = dsiq.Client("dsiq_live_...")

rows = [
    {"order_id": "ORD-001", "amount": 99.50,    "email": "alice@corp.com"},
    {"order_id": "ORD-002", "amount": "broken", "email": None},
    {"order_id": "ORD-003", "amount": 75.00,    "email": None},
]

report = client.screen(rows, source="orders")

print(report.status)        # BLOCK
print(report.health_pct)    # 34.0%
print(report.type_mismatches)  # ["amount"]
print(report.null_rates)       # {"email": 0.5}
print(report.summary())
# 🚨 BLOCK | Health: 34.0% | Rows: 3 | Type mismatches: amount | Null rate: email=50% | (9ms)
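To make the terms "null rate" and "type mismatch" concrete, here is a minimal local sketch that computes both for a list of dicts. It is not the service's algorithm (which is not documented here); the function names `profile_rows` and `kind_of`, the sample data, and the rule that int and float count as one kind are all assumptions made for illustration.

```python
def kind_of(value):
    """Coarse type label; int and float are grouped as "number" (an assumption)."""
    if isinstance(value, bool):
        return "bool"
    if isinstance(value, (int, float)):
        return "number"
    return type(value).__name__


def profile_rows(rows):
    """Return (null_rates, type_mismatches) for a list of dicts.

    A missing key is counted as a null. Fields without nulls are omitted
    from null_rates.
    """
    if not rows:
        return {}, []
    fields = sorted({key for row in rows for key in row})
    null_rates = {}
    type_mismatches = []
    for field in fields:
        values = [row.get(field) for row in rows]
        nulls = sum(value is None for value in values)
        if nulls:
            null_rates[field] = round(nulls / len(rows), 3)
        # A field is flagged when its non-null values span more than one kind.
        kinds = {kind_of(value) for value in values if value is not None}
        if len(kinds) > 1:
            type_mismatches.append(field)
    return null_rates, type_mismatches


# Hypothetical sample data, unrelated to the orders example above.
sample_rows = [
    {"sku": "A-1", "price": 10.0, "note": None},
    {"sku": "A-2", "price": "n/a", "note": "gift"},
    {"sku": "A-3", "price": 12, "note": None},
    {"sku": "A-4", "price": 8.5, "note": "rush"},
]

null_rates, mismatches = profile_rows(sample_rows)
print(null_rates)   # {'note': 0.5}
print(mismatches)   # ['price']
```

A check like this can serve as a cheap local smoke test, but it does not replace the screening call, which also reports drift and outliers.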

API key

Set as environment variable (recommended):

export DATASCREENIQ_API_KEY="dsiq_live_..."

client = dsiq.Client()  # reads from env automatically

Or pass directly:

client = dsiq.Client("dsiq_live_...")
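If you want to fail fast when no key is configured, a small helper can resolve the key before the client is constructed. This is a sketch under assumptions: `resolve_api_key` is a hypothetical helper, and the precedence it applies (explicit argument first, then the `DATASCREENIQ_API_KEY` environment variable) is an assumption about sensible behaviour, not a description of the SDK's internal lookup.

```python
import os

ENV_VAR = "DATASCREENIQ_API_KEY"  # variable name taken from the export line above


def resolve_api_key(explicit=None, environ=None):
    """Return the explicit key if given, else the environment value.

    Raises RuntimeError when neither source provides a non-empty key.
    `environ` defaults to os.environ and is injectable for testing.
    """
    environ = os.environ if environ is None else environ
    key = (explicit or environ.get(ENV_VAR, "")).strip()
    if not key:
        raise RuntimeError(f"No API key: pass one explicitly or set {ENV_VAR}")
    return key
```

The resolved value can then be passed as the first argument to `dsiq.Client(...)`, as in the "pass directly" form above.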

Usage

Screen a list of dicts

report = client.screen(rows, source="orders")

Screen a CSV file

report = client.screen_file("orders.csv", source="orders")

Screen an Excel file

# pip install datascreeniq[excel]
report = client.screen_file("orders.xlsx", source="orders", sheet=0)

Screen a pandas DataFrame

# pip install datascreeniq[pandas]
import pandas as pd

df = pd.read_csv("orders.csv")
report = client.screen_dataframe(df, source="orders")

Screen a JSON or XML file

report = client.screen_file("orders.json", source="orders")
report = client.screen_file("orders.xml",  source="orders")

The ScreenReport object

report.status           # "PASS" | "WARN" | "BLOCK"
report.health_score     # float 0.0 – 1.0
report.health_pct       # "94.5%"

report.is_pass          # True / False
report.is_warn          # True / False
report.is_blocked       # True / False

report.issues           # full issues dict
report.type_mismatches  # ["amount", "price"]
report.null_rates       # {"email": 0.50}
report.outlier_fields   # ["amount"]

report.drift            # list of drift events
report.drift_count      # int
report.has_drift        # True / False

report.rows_received    # int
report.rows_sampled     # int
report.latency_ms       # int
report.batch_id         # str
report.timestamp        # ISO string

report.summary()        # human-readable one-liner
report.to_dict()        # raw API response
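One way to use the three status values is to map each to a pipeline action. The sketch below relies only on the `status` attribute listed above; `StubReport`, `ACTIONS`, and `next_action` are hypothetical names, and the action labels are placeholders for whatever your pipeline does.

```python
from dataclasses import dataclass


@dataclass
class StubReport:
    """Stand-in exposing the same `status` attribute as ScreenReport."""
    status: str


# Hypothetical policy: what to do with a batch for each screening status.
ACTIONS = {
    "PASS": "load",
    "WARN": "load_and_alert",
    "BLOCK": "quarantine",
}


def next_action(report):
    """Translate a report's status into a pipeline action label."""
    try:
        return ACTIONS[report.status]
    except KeyError:
        raise ValueError(f"Unexpected status: {report.status!r}") from None


print(next_action(StubReport("WARN")))  # load_and_alert
```

Keeping the policy in a single dict makes it easy to change, for example to quarantine on WARN in stricter environments.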

Pipeline integration

Raise on block

from datascreeniq.exceptions import DataQualityError

try:
    client.screen(rows, source="orders").raise_on_block()
    # only reaches here if PASS or WARN
    load_to_warehouse(rows)

except DataQualityError as e:
    print(f"Blocked: {e}")
    print(f"Issues:  {e.report.issues}")
    send_to_dead_letter_queue(rows)

Airflow task

from airflow.decorators import task
import datascreeniq as dsiq

@task
def quality_gate(rows: list, source: str) -> dict:
    client = dsiq.Client()   # reads DATASCREENIQ_API_KEY from env
    report = client.screen(rows, source=source)
    if report.is_blocked:
        raise ValueError(f"Data blocked: {report.summary()}")
    return report.to_dict()

Prefect flow

from prefect import flow, task
import datascreeniq as dsiq

@task
def screen_data(rows, source):
    return dsiq.Client().screen(rows, source=source).raise_on_block()

@flow
def my_pipeline():
    rows = extract_from_source()
    screen_data(rows, source="orders")   # blocks flow if quality fails
    load_to_warehouse(rows)

dbt post-hook

import pandas as pd
import datascreeniq as dsiq

def screen_dbt_model(model_name: str, conn):
    df = pd.read_sql(f"SELECT * FROM {model_name} LIMIT 10000", conn)
    return dsiq.Client().screen_dataframe(df, source=model_name).raise_on_block()

Large files — auto chunking

Files with more than 10,000 rows are automatically split into chunks and screened in parallel. Results are merged into a single report:

# 1M row file — 100 API calls, one merged report
report = client.screen_file("events.csv", source="events")
print(f"Screened {report.rows_received:,} rows")
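The arithmetic behind "100 API calls" can be sketched locally. The chunk size of 10,000 is inferred from the figures above (1M rows, 100 calls), and the merge rule shown here (the most severe chunk status wins) is an assumption, since the text does not say how chunk results are combined. `chunked` and `merge_statuses` are hypothetical helpers, not SDK functions.

```python
import math
from itertools import islice

CHUNK_SIZE = 10_000  # inferred from "1M row file, 100 API calls"

SEVERITY = {"PASS": 0, "WARN": 1, "BLOCK": 2}


def chunked(rows, size=CHUNK_SIZE):
    """Yield successive lists of at most `size` items from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def merge_statuses(statuses):
    """Assumed merge rule: return the most severe status in the sequence."""
    return max(statuses, key=SEVERITY.__getitem__)


print(math.ceil(1_000_000 / CHUNK_SIZE))          # 100
print(merge_statuses(["PASS", "WARN", "PASS"]))   # WARN
```

Under this rule a single BLOCK chunk blocks the whole file, which is the conservative choice for a gate placed before storage.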

Error handling

from datascreeniq.exceptions import (
    AuthenticationError,   # invalid API key
    PlanLimitError,        # monthly row limit exceeded
    RateLimitError,        # too many requests
    ValidationError,       # bad payload
    APIError,              # server error
    DataQualityError,      # raised by .raise_on_block()
)

try:
    report = client.screen(rows, source="orders")
except AuthenticationError:
    print("Check your API key")
except PlanLimitError:
    print("Monthly limit reached — upgrade at datascreeniq.com")
except RateLimitError as e:
    print(f"Rate limited: {e}")
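Rate limiting is typically transient, so a caller may prefer to retry with exponential backoff rather than give up. The helper below is a generic sketch, not part of the SDK: `call_with_retries` and `TransientError` are hypothetical names, and the delay schedule is an arbitrary choice.

```python
import time


class TransientError(Exception):
    """Stand-in for a retryable error such as RateLimitError."""


def call_with_retries(func, retry_on=(TransientError,), attempts=4,
                      base_delay=0.5, sleep=time.sleep):
    """Call func(); on a retryable error wait base_delay * 2**attempt and retry.

    The last failure is re-raised once `attempts` calls have been made.
    `sleep` is injectable so tests do not have to wait.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

With the SDK this might be used as `call_with_retries(lambda: client.screen(rows, source="orders"), retry_on=(RateLimitError,))`. Errors such as `AuthenticationError` or `PlanLimitError` should not be retried, since waiting will not fix them.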

Why DataScreenIQ exists

• Dashboards break AFTER bad data is already stored
• Data tests are usually batch-based and too late
• Silent corruption is the most expensive failure in data systems

DataScreenIQ moves validation to the edge — before storage, before transformation, before damage.

Why trust this

Built for production workloads:
• Handles 1M+ rows via auto-chunking
• Parallel validation engine
• Sub-second latency decisions

Pricing

Plan	Price	Rows / month
Developer	Free	500K
Starter	$19/mo	5M
Growth	$79/mo	50M
Scale	$199/mo	500M
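To estimate which plan fits a given volume, the table can be turned into a small lookup. This sketch assumes that "500K" means 500,000 rows, that limits are inclusive, and that every row sent for screening counts toward the monthly total; `PLANS` and `cheapest_plan` are hypothetical names.

```python
# (plan name, price in USD per month, rows per month) copied from the table above.
PLANS = [
    ("Developer", 0, 500_000),
    ("Starter", 19, 5_000_000),
    ("Growth", 79, 50_000_000),
    ("Scale", 199, 500_000_000),
]


def cheapest_plan(rows_per_month):
    """Return (name, price) of the cheapest plan covering the volume, or None."""
    for name, price, limit in PLANS:
        if rows_per_month <= limit:
            return name, price
    return None


# Example: one 1M-row file screened every day of a 30-day month.
print(cheapest_plan(30 * 1_000_000))  # ('Growth', 79)
```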

Get your free API key →

Requirements

Python 3.8+
requests (auto-installed)
pandas — optional, for screen_dataframe()
openpyxl — optional, for Excel files

License

MIT

datascreeniq 1.0.3
