A minimal, functional Python ETL library for reading, validating, and transforming data using YAML schemas
Aptoro
Aptoro is a Xavante word for "preparing the arrows for hunting".
It is a minimal, functional Python ETL library for reading, validating, and transforming data using YAML schemas. Designed for simplicity and correctness, it bridges the gap between raw data files (CSV, JSON) and typed, validated Python objects.
Features
- Schema-First: Define your data model in simple, readable YAML.
- Strict Validation: Ensures data quality with type checks, constraints, and range validation.
- Rich Types: Built-in support for datetime (ISO 8601), url, file, and standard primitives.
- Functional API: Pure functions and immutable dataclasses make pipelines predictable.
- Zero Boilerplate: No complex class definitions—just load your schema and go.
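To illustrate what "pure functions and immutable dataclasses" means in practice, here is a minimal standard-library sketch. The `Entry` class and `bump_frequency` function are hypothetical names made up for this example; they are not part of Aptoro's API, and the objects Aptoro returns may look different.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Entry:
    """Hypothetical record type; not a class exported by aptoro."""
    lemma: str
    frequency: int = 0


def bump_frequency(entry: Entry, by: int = 1) -> Entry:
    """Pure transform: returns a new Entry and leaves the input untouched."""
    return replace(entry, frequency=entry.frequency + by)


original = Entry(lemma="arrow")
updated = bump_frequency(original)

try:
    original.frequency = 99  # in-place mutation is rejected on frozen dataclasses
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because each step returns a new object, earlier pipeline results stay valid no matter what later steps do.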
Installation
pip install aptoro
CLI Usage
Aptoro provides a command-line interface for validating data files directly.
# Validate a CSV file against a schema
aptoro validate data.csv --schema schema.yaml
# Explicitly specify format
aptoro validate data.txt --schema schema.yaml --format json
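If you want to call the CLI from a script, a small wrapper along these lines may help. It uses only the `validate` subcommand and the `--schema` and `--format` flags shown above. The helper names are hypothetical, and treating exit status 0 as "valid" is an assumption, since the exit codes are not documented here.

```python
import subprocess
from typing import List, Optional


def build_validate_command(data_path: str, schema_path: str,
                           fmt: Optional[str] = None) -> List[str]:
    """Assemble the argv for `aptoro validate` using only the flags shown above."""
    cmd = ["aptoro", "validate", data_path, "--schema", schema_path]
    if fmt is not None:
        cmd += ["--format", fmt]
    return cmd


def run_validate(data_path: str, schema_path: str,
                 fmt: Optional[str] = None) -> bool:
    """Run the CLI; interpreting exit status 0 as success is an assumption."""
    result = subprocess.run(build_validate_command(data_path, schema_path, fmt))
    return result.returncode == 0
```

For example, `run_validate("data.txt", "schema.yaml", fmt="json")` corresponds to the second command above.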
Quick Start
from aptoro import load, load_schema, read, validate, to_json
# All-in-one: read + validate
entries = load(source="data.csv", schema="schema.yaml")
# Or as a step-by-step pipeline:
schema = load_schema("schema.yaml")
data = read("data.csv")
entries = validate(data, schema)
# Export to JSON
json_str = to_json(entries)
# Export with embedded metadata (self-describing files)
json_meta = to_json(entries, schema=schema, include_meta=True)
Documentation
For full details on the schema language, advanced validation, and API reference, see the Documentation.
Schema Language
Define your data schema in YAML:
name: lexicon_entry
description: Dictionary entries
fields:
  id: str
  lemma: str
  pos: str[noun|verb|adj|adv]  # Constrained values (Enum)
  definition: str
  translation: str?            # Optional field
  examples: list[str]?         # Optional list
  frequency: int = 0           # Default value
  created_at: datetime?        # Optional ISO 8601 datetime
  source_url: url?             # Optional URL
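To try the schema above, you need a data file to go with it. The sketch below writes the schema and a small CSV into a temporary directory using only the standard library. The sample rows and the `write_sample_files` helper are invented for illustration. Leaving the optional columns out of the CSV is an assumption about how Aptoro treats missing optional fields.

```python
import csv
import tempfile
from pathlib import Path

# The schema from the section above, with fields nested under `fields:`.
SCHEMA_YAML = """\
name: lexicon_entry
description: Dictionary entries
fields:
  id: str
  lemma: str
  pos: str[noun|verb|adj|adv]
  definition: str
  translation: str?
  examples: list[str]?
  frequency: int = 0
  created_at: datetime?
  source_url: url?
"""

# Invented sample rows; only required fields plus `frequency` are present.
ROWS = [
    {"id": "e1", "lemma": "run", "pos": "verb",
     "definition": "to move quickly on foot", "frequency": "12"},
    {"id": "e2", "lemma": "arrow", "pos": "noun",
     "definition": "a pointed projectile shot from a bow", "frequency": "3"},
]


def write_sample_files(directory: Path):
    """Write schema.yaml and data.csv into `directory` and return both paths."""
    schema_path = directory / "schema.yaml"
    data_path = directory / "data.csv"
    schema_path.write_text(SCHEMA_YAML, encoding="utf-8")
    with data_path.open("w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=list(ROWS[0]))
        writer.writeheader()
        writer.writerows(ROWS)
    return schema_path, data_path


workdir = Path(tempfile.mkdtemp())
schema_path, data_path = write_sample_files(workdir)
```

With these files in place, the `load(source=..., schema=...)` call from the Quick Start can be pointed at `data_path` and `schema_path`.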
Type Syntax
- Basic types: str, int, float, bool
- Specialized types: url, file, datetime
- Optional: str?, int?, url?, datetime?
- Default value: str = "default", int = 0, datetime = "2024-01-01"
- Constrained: str[a|b|c]
- Lists: list[str], list[int]
See DOCS.md for full syntax, including inheritance and nested structures.
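To make the shape of these type expressions concrete, the following sketch breaks one into its parts: base type, list wrapper, optional marker, allowed values, and default. It is not Aptoro's parser. `FieldSpec` and `parse_type` are hypothetical names, and the sketch covers only the forms listed above, not inheritance or nested structures.

```python
import re
from dataclasses import dataclass
from typing import Optional, Tuple

BASE_TYPES = {"str", "int", "float", "bool", "url", "file", "datetime"}


@dataclass(frozen=True)
class FieldSpec:
    base: str                      # scalar type, or the item type for lists
    is_list: bool = False
    optional: bool = False
    choices: Tuple[str, ...] = ()  # allowed values for constrained types
    default: Optional[str] = None  # raw default text, left uninterpreted


_PATTERN = re.compile(
    r"""^\s*
        (?P<name>[a-z]+)              # base type name or the word list
        (?:\[(?P<inner>[^\]]*)\])?    # bracketed part: choices or item type
        (?P<opt>\?)?                  # trailing question mark
        (?:\s*=\s*(?P<default>.+?))?  # default value after an equals sign
        \s*$""",
    re.VERBOSE,
)


def parse_type(expr: str) -> FieldSpec:
    """Split a type expression such as 'list[str]?' into its components."""
    m = _PATTERN.match(expr)
    if m is None:
        raise ValueError(f"unrecognised type expression: {expr!r}")
    name, inner = m["name"], m["inner"]
    if name == "list":
        if inner not in BASE_TYPES:
            raise ValueError(f"unsupported list item type in {expr!r}")
        base, is_list, choices = inner, True, ()
    else:
        if name not in BASE_TYPES:
            raise ValueError(f"unknown base type in {expr!r}")
        base, is_list = name, False
        choices = tuple(inner.split("|")) if inner is not None else ()
    return FieldSpec(base, is_list, m["opt"] is not None, choices, m["default"])
```

For instance, `parse_type("list[str]?")` yields a spec with `base="str"`, `is_list=True`, and `optional=True`.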
Supported Formats
- CSV (auto-detects types)
- JSON
- YAML
- TOML
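CSV cells are plain strings, so "auto-detects types" means some inference step has to turn `"12"` into a number. Aptoro's actual detection rules are not described here. The sketch below shows one common approach (try bool, then int, then float, else keep the string), with the hypothetical helper name `infer_scalar`.

```python
import csv
import io
from typing import Union

Scalar = Union[bool, int, float, str]


def infer_scalar(text: str) -> Scalar:
    """Guess a Python value for one CSV cell: bool, then int, then float, else str."""
    stripped = text.strip()
    lowered = stripped.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(stripped)
        except ValueError:
            continue
    return text  # nothing matched, so keep the original string


# Invented sample data for illustration.
sample = "lemma,frequency,archaic\nrun,12,false\n"
rows = [
    {key: infer_scalar(value) for key, value in row.items()}
    for row in csv.DictReader(io.StringIO(sample))
]
```

JSON, YAML, and TOML carry their own scalar types, so this kind of guessing is only needed for CSV input.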
License
GNU General Public License v3 (GPLv3)