Python package for validating, processing and parsing directories.
katachi
Katachi is a Python package for validating, processing, and parsing directory structures against defined schemas.
Note: Katachi is currently under active development and should be considered a work in progress. APIs may change in future releases.
- GitHub repository: https://github.com/nmicovic/katachi/
- Documentation: https://nmicovic.github.io/katachi/
Features
- 📐 Schema-based validation - Define expected directory structures using YAML
- 🧩 Extensible architecture - Create custom validators and actions
- 🔄 Relationship validation - Validate relationships between files (like paired files)
- 🚀 Command-line interface - Easy-to-use CLI with rich formatting
- 📋 Detailed reports - Get comprehensive validation reports
Installation
Install from PyPI:
pip install katachi
For development:
git clone https://github.com/nmicovic/katachi.git
cd katachi
make install
Quick Start
Define a schema (schema.yaml)
semantical_name: data
type: directory
pattern_name: data
children:
  - semantical_name: image
    pattern_name: "img\\d+"
    type: file
    extension: .jpg
    description: "Image files with numeric identifiers"
  - semantical_name: metadata
    pattern_name: "img\\d+"
    type: file
    extension: .json
    description: "Metadata for image files"
  - semantical_name: file_pairs_check
    type: predicate
    predicate_type: pair_comparison
    description: "Check if images have matching metadata files"
    elements:
      - image
      - metadata
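To make the intent of this schema concrete, the sketch below approximates the two checks it describes using only the standard library: matching file stems against `img\d+` for a given extension, and comparing the matched stems of the two groups. It illustrates the idea and is not Katachi's actual implementation; the helper names `matching_stems` and `unpaired` are hypothetical, and treating `pattern_name` as a full match on the file stem is an assumption.

```python
import re
from pathlib import Path


def matching_stems(directory: Path, pattern: str, extension: str) -> set[str]:
    """Return stems of files in `directory` whose stem fully matches `pattern`
    and whose suffix equals `extension`."""
    regex = re.compile(pattern)
    return {
        entry.stem
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix == extension and regex.fullmatch(entry.stem)
    }


def unpaired(directory: Path) -> set[str]:
    """Return stems that appear as an image or as metadata, but not as both."""
    images = matching_stems(directory, r"img\d+", ".jpg")
    metadata = matching_stems(directory, r"img\d+", ".json")
    # Symmetric difference: stems present in exactly one of the two groups.
    return images ^ metadata
```

An empty result from `unpaired` corresponds to what the `pair_comparison` predicate is described as checking: every image has a metadata file with the same stem, and vice versa.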
Validate a directory structure
katachi validate schema.yaml target_directory
Command-Line Examples
Validate a simple directory structure:
katachi validate "tests/schema_tests/test_sanity/schema.yaml" "tests/schema_tests/test_sanity/dataset"
Validate a nested directory structure:
katachi validate "tests/schema_tests/test_depth_1/schema.yaml" "tests/schema_tests/test_depth_1/dataset"
Validate paired files (e.g., ensure each .jpg has a matching .json file):
katachi validate "tests/schema_tests/test_paired_files/schema.yaml" "tests/schema_tests/test_paired_files/data"
Python API
from pathlib import Path

from katachi.schema.importer import load_yaml
from katachi.schema.validate import validate_schema

# Load schema from YAML
schema = load_yaml(Path("schema.yaml"), Path("data_directory"))

# Validate directory against schema
report = validate_schema(schema, Path("data_directory"))

# Check if validation passed
if report.is_valid():
    print("Validation successful!")
else:
    print("Validation failed with the following issues:")
    for result in report.results:
        if not result.is_valid:
            print(f"- {result.path}: {result.message}")
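When a report contains many results, grouping the failures by path can be easier to read than a flat list. The helper below is a sketch that relies only on the `is_valid`, `path`, and `message` attributes used in the example above. `failures_by_path` is a hypothetical name, and the `FakeResult` dataclass is a stand-in for Katachi's result objects so that the snippet runs without the package installed.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Iterable


@dataclass
class FakeResult:
    """Stand-in exposing the attributes the example above reads from a result."""
    is_valid: bool
    path: Path
    message: str


def failures_by_path(results: Iterable) -> dict[Path, list[str]]:
    """Group the messages of failed results by the path they refer to."""
    grouped: dict[Path, list[str]] = defaultdict(list)
    for result in results:
        if not result.is_valid:
            grouped[result.path].append(result.message)
    return dict(grouped)


results = [
    FakeResult(True, Path("data/img1.jpg"), "ok"),
    FakeResult(False, Path("data/img2.jpg"), "missing metadata"),
    FakeResult(False, Path("data/img2.jpg"), "unexpected size"),
]
for path, messages in failures_by_path(results).items():
    print(f"{path}: {'; '.join(messages)}")
```

With a real report, the equivalent call would be `failures_by_path(report.results)`, assuming `report.results` is iterable as in the example above.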
Extending Katachi
Custom validators
from pathlib import Path

from katachi.schema.schema_node import SchemaNode
from katachi.validation.core import ValidationResult, ValidatorRegistry

def my_custom_validator(node: SchemaNode, path: Path) -> ValidationResult:
    # Custom validation logic
    return ValidationResult(
        is_valid=True,
        message="Custom validation passed",
        path=path,
        validator_name="custom_validator"
    )

# Register the validator
ValidatorRegistry.register("custom_validator", my_custom_validator)
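The body of a validator is ordinary Python. As an illustration, the sketch below implements a file-size limit as a plain function returning an `(is_valid, message)` tuple, which could then be wrapped in a `ValidationResult` as in the example above. The function name `check_max_size` and its `max_bytes` parameter are hypothetical and not part of Katachi.

```python
from pathlib import Path


def check_max_size(path: Path, max_bytes: int = 1_000_000) -> tuple[bool, str]:
    """Return (is_valid, message) for a simple file-size limit."""
    if not path.is_file():
        return False, f"{path.name} is not a regular file"
    size = path.stat().st_size
    if size > max_bytes:
        return False, f"{path.name} is {size} bytes, limit is {max_bytes}"
    return True, f"{path.name} is within the size limit"
```

Keeping the check separate from the Katachi types makes it easy to unit test on its own before registering a validator that calls it.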
Custom file processing
from pathlib import Path
from typing import Any

from katachi.schema.actions import register_action, NodeContext

def process_image(node, path: Path, parent_contexts: list[NodeContext], context: dict[str, Any]) -> None:
    # Custom image processing logic
    print(f"Processing image: {path}")
    # Access parent context if needed
    for parent_node, parent_path in parent_contexts:
        if parent_node.semantical_name == "timestamp":
            print(f"Image from date: {parent_path.name}")
            break

# Register the action
register_action("image", process_image)
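Actions also receive a `context` dictionary. The sketch below follows the same signature as `process_image` and records each file's size into that dictionary. It avoids importing Katachi so that it can run standalone; the `sizes` key and the function name `record_size` are hypothetical, and whether Katachi passes one shared `context` dictionary to every action call is an assumption here.

```python
from pathlib import Path
from typing import Any


def record_size(node: Any, path: Path, parent_contexts: list, context: dict[str, Any]) -> None:
    """Store the size in bytes of `path` under context["sizes"], keyed by file name."""
    sizes = context.setdefault("sizes", {})
    sizes[path.name] = path.stat().st_size


# With Katachi installed, registration would mirror the example above:
# register_action("image", record_size)
```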
Contributing
Contributions are welcome! See CONTRIBUTING.md for details.
License
This project is licensed under the terms of the MIT License.
File details
Details for the file katachi-0.0.5a0.tar.gz.
File metadata
- Download URL: katachi-0.0.5a0.tar.gz
- Upload date:
- Size: 3.4 MB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 260bc71072129840e2281811a6b8f4ee30b11492ed23459634d0cb714be83adc |
| MD5 | aabff4a6f88a8c3c2e80b3d14745f884 |
| BLAKE2b-256 | 4d59920327530032a9513101eea48f9fb32b8fbb7d2126a43b96da9413fe7942 |
File details
Details for the file katachi-0.0.5a0-py3-none-any.whl.
File metadata
- Download URL: katachi-0.0.5a0-py3-none-any.whl
- Upload date:
- Size: 18.1 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: uv/0.6.14
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 6450e3be158a47c13aa121880658cb2d8bf3b18cc5934b9b092be9ce0741e0e7 |
| MD5 | 2f57d8c309abd646b43bda9b9eb36368 |
| BLAKE2b-256 | 4ca15f7bfa7c6331fc0e3cbb5e52cd49276eac446d0d4fdb401b86e97df0edbd |