PyData Constraints
PyData Constraints is the easiest way to validate your data streams in Python. Whether you have small JSON files or massive CSV dumps, this tool ensures your data isn't garbage.
🚀 Why PyData Constraints?
- Rule-Based: Define your rules in simple JSON/YAML files. No coding required.
- Universal: Works with JSON and CSV out of the box using efficient streaming.
- Developer Friendly: Written in pure Python with minimal dependencies.
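To picture what streaming validation means in practice, here is a minimal sketch in plain Python. It is not taken from PyData Constraints' internals, which this README does not show; `iter_csv_rows` and `count_invalid` are hypothetical helper names, and the in-memory `io.StringIO` stands in for a large CSV file on disk.

```python
import csv
import io
import re
from typing import Iterable, Iterator

EMAIL_PATTERN = re.compile(r"^[^\s@]+@[^\s@]+\.[^\s@]+$")


def iter_csv_rows(handle: Iterable[str]) -> Iterator[dict]:
    """Yield one row at a time so the whole file never sits in memory."""
    yield from csv.DictReader(handle)


def count_invalid(rows: Iterable[dict], field: str, pattern: re.Pattern) -> int:
    """Count rows whose field does not match the pattern, one row at a time."""
    return sum(1 for row in rows if not pattern.match(row.get(field) or ""))


# io.StringIO stands in for open("users.csv", newline="") on a real dump.
sample = io.StringIO("id,email\n1,alice@example.com\n2,bob-has-no-domain\n")
invalid_count = count_invalid(iter_csv_rows(sample), "email", EMAIL_PATTERN)
print(invalid_count)
```

Because both helpers consume an iterator, memory use stays roughly constant regardless of how many rows the file holds.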
⚡ Basic Use Case (At a glance)
Imagine you have a users.json file and you want to ensure all emails are valid.
1. Your Data (users.json):
[
  { "id": 1, "email": "alice@example.com" },
  { "id": 2, "email": "bob-has-no-domain" }
]
2. Your Rules (config.json):
{
  "sources": [
    {
      "service": "users",
      "type": "file",
      "path": "users.json",
      "format": "json"
    }
  ],
  "constraints": [
    {
      "type": "format",
      "id": "valid-email",
      "service": "users",
      "field": "email",
      "regex": "^[^\\s@]+@[^\\s@]+\\.[^\\s@]+$",
      "message": "Invalid email: {{email}}"
    }
  ]
}
3. Run and get results:
$ data-constraints validate --config config.json
[INFO] Validating data...
[valid-email] (format) Invalid email: bob-has-no-domain
Validation finished. Found 1 issues.
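To see why only the second record is reported, the same check can be reproduced with Python's standard `re` module. This is a rough sketch, not the library's actual implementation: `check_format` is a hypothetical helper, and the `{{email}}` substitution is an assumption inferred from the sample output above.

```python
import re

# Data and rule values copied from users.json and config.json above.
users = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": "bob-has-no-domain"},
]
rule = {
    "id": "valid-email",
    "field": "email",
    "regex": r"^[^\s@]+@[^\s@]+\.[^\s@]+$",  # JSON "\\s" becomes "\s" once parsed
    "message": "Invalid email: {{email}}",
}


def check_format(records, rule):
    """Return one rendered message per record whose field fails the regex."""
    pattern = re.compile(rule["regex"])
    placeholder = "{{" + rule["field"] + "}}"
    issues = []
    for record in records:
        value = str(record.get(rule["field"], ""))
        if not pattern.search(value):
            issues.append(rule["message"].replace(placeholder, value))
    return issues


issues = check_format(users, rule)
print(issues)
```

The address `bob-has-no-domain` fails because the pattern requires an `@` followed by a dot-separated domain.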
📦 Installation
To install via pip:
pip install pydata-constraints
This installs both the Python package pydata_constraints and the CLI command data-constraints.
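One way to confirm that both pieces landed in the current environment is a short standard-library check. This sketch only looks for the package and command names stated above; `installation_status` is a hypothetical helper and does not call any PyData Constraints API.

```python
import importlib.util
import shutil


def installation_status() -> dict:
    """Report whether the package is importable and the CLI is on PATH."""
    return {
        "package_importable": importlib.util.find_spec("pydata_constraints") is not None,
        "cli_on_path": shutil.which("data-constraints") is not None,
    }


status = installation_status()
print(status)
```

If either value is `False`, the install most likely went into a different virtual environment than the one running the check.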
🧠 Basic Concepts
PyData Constraints works with three core files:
- Data Files: Your actual data dumps in .json or .csv format.
- Config File: A JSON/YAML file pointing the engine to your data and rules.
- Constraints (Rules): The definitions of what is valid.
Rule Types at a Glance
- 📝 Format: Ensure strings look correct (e.g. emails).
  - Example: "regex": "^[^\\s@]+@[^\\s@]+\\.[^\\s@]+$"
- 🆔 Unique: Ensure no duplicate IDs exist across a file.
  - Example: "field": "employee_id"
- 🔗 Foreign Key: Ensure referenced IDs actually exist in another file.
  - Example: order.userId must exist in users.id
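As an illustration of what a Unique rule checks, the sketch below finds repeated `employee_id` values in plain Python. It is an assumption about the rule's semantics, not the engine's code; `find_duplicates` and the sample records are hypothetical.

```python
from collections import Counter


def find_duplicates(records, field):
    """Return each value of `field` that appears more than once."""
    counts = Counter(record[field] for record in records if field in record)
    return [value for value, count in counts.items() if count > 1]


# Hypothetical sample data: "E1" appears twice.
employees = [
    {"employee_id": "E1"},
    {"employee_id": "E2"},
    {"employee_id": "E1"},
]
duplicates = find_duplicates(employees, "employee_id")
print(duplicates)
```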
📚 Documentation
For full documentation, guides and advanced use cases, please check the docs/ directory.
- Key Concepts: Easy-to-understand explanation of file types and constraints.
- User Guide: The comprehensive guide to using the CLI and defining rules.
- Integration Guide: How to integrate the engine programmatically in Python.
- Examples: Runnable examples, ranging from simple to e-commerce.
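The package's own Python API is described in the Integration Guide and is not shown in this README. As a stopgap that relies only on the CLI command documented above, a script can shell out to it with `subprocess`. In this sketch, `build_command` and `run_validation` are hypothetical helpers, and the meaning of the CLI's exit code is not documented here, so the caller should inspect the captured output instead of assuming one.

```python
import shutil
import subprocess


def build_command(config_path: str) -> list:
    """Build the argument list for the documented `validate` subcommand."""
    return ["data-constraints", "validate", "--config", config_path]


def run_validation(config_path: str):
    """Run the CLI if installed; return None when it is not on PATH."""
    if shutil.which("data-constraints") is None:
        return None
    return subprocess.run(
        build_command(config_path),
        capture_output=True,  # collect stdout/stderr for the caller
        text=True,
        check=False,  # exit-code semantics are not documented in this README
    )


command = build_command("config.json")
print(command)
```

Calling `run_validation("config.json")` returns a `subprocess.CompletedProcess` whose `stdout` holds the report text.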
🛠️ Features
| Feature | Description |
|---|---|
| Format Validation | Regex-based validation for strings (Emails, Phones, Codes). |
| Unique Validation | Ensure IDs and codes are unique across your dataset. |
| Foreign Keys | Validate relationships between different files (e.g. order.userId -> users.id). |
| Multiple Reporters | Output results to Console, JSON, or Markdown files. |
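The Foreign Keys row can be pictured as a set-membership test between two record collections. The sketch below is an assumption about the semantics, not the engine's implementation; `find_missing_references`, the `orderId` field, and the sample data are hypothetical.

```python
def find_missing_references(children, child_field, parents, parent_field):
    """Return child records whose reference has no matching parent value."""
    known = {parent[parent_field] for parent in parents if parent_field in parent}
    return [child for child in children if child.get(child_field) not in known]


# Hypothetical sample data: order "B" points at a user that does not exist.
users = [{"id": 1}, {"id": 2}]
orders = [
    {"orderId": "A", "userId": 1},
    {"orderId": "B", "userId": 7},
]
orphans = find_missing_references(orders, "userId", users, "id")
print(orphans)
```

Building the set of parent IDs first keeps each lookup constant-time, which matters when the child file is large.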
🤝 Contributing
Contributions, bug reports, and feature requests are welcome. If you have ideas, open a pull request!
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
File details
Details for the file pydata_constraints-1.0.1.tar.gz.
File metadata
- Download URL: pydata_constraints-1.0.1.tar.gz
- Upload date:
- Size: 16.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2462ed588cad25ff378167bda80a22b2e2a9f9a1b12286473555f68aadbbb440 |
| MD5 | 86cd3a3ad78179e9372b490d52bba76b |
| BLAKE2b-256 | 5fe7a634be47dac5d2135191fbe0a2e424789bb454cad80f5a9b41533d9a46c1 |
File details
Details for the file pydata_constraints-1.0.1-py3-none-any.whl.
File metadata
- Download URL: pydata_constraints-1.0.1-py3-none-any.whl
- Upload date:
- Size: 23.4 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.2.0 CPython/3.12.3
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 2619dca9e80c8a9f3e3e4b1e57ddae5d63464142d725e61dd7d9a017925937c1 |
| MD5 | 71c61aab107ee552bf60018f40bf68de |
| BLAKE2b-256 | 63f544f5633755eecf920484ba9a1790d7eacfeb48ff0f0cc334e966a4e21418 |