Wrapper components for the Sayou Data Platform
sayou-wrapper
sayou-wrapper is a schema normalization library that acts as the "Gatekeeper" of the Sayou Data Platform. It transforms heterogeneous data from various sources (Chunks, API responses, CSVs) into the unified Sayou Standard Schema.
This library ensures that downstream components like sayou-assembler do not need to understand the complexity of external data formats. If data passes through the Wrapper, it is guaranteed to be a valid SayouNode.
Philosophy
"Polymorphic Input, Monomorphic Output."
Regardless of whether the input is a Markdown chunk, a Subway API JSON, or a Stock price list, the output must always be a list of standardized SayouNode objects. This abstraction allows the Knowledge Graph builder to remain agnostic to the data source.
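As a rough illustration of this idea, the sketch below normalizes two differently shaped inputs into the same node dictionary. It is not the library's implementation: the function names, the input field names, the example: class names, and the choice of a dict for relationships are all assumptions made for the example. Only the four field names come from this README.

```python
from typing import Any, Dict, List

NODE_FIELDS = {"node_id", "node_class", "attributes", "relationships"}


def make_node(node_id: str, node_class: str, attributes: Dict[str, Any]) -> Dict[str, Any]:
    """Build a dict carrying the four SayouNode fields named in this README."""
    return {
        "node_id": node_id,
        "node_class": node_class,
        "attributes": attributes,
        "relationships": {},  # assumed to be a mapping; the real type may differ
    }


def from_markdown_chunk(chunk: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical chunk layout: {"chunk_id": ..., "text": ...}
    return make_node(f"chunk:{chunk['chunk_id']}", "example:Text", {"text": chunk["text"]})


def from_stock_row(row: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical CSV row layout: {"ticker": ..., "price": ...}
    return make_node(f"stock:{row['ticker']}", "example:StockPrice", {"price": float(row["price"])})


# Two very different inputs...
nodes: List[Dict[str, Any]] = [
    from_markdown_chunk({"chunk_id": 1, "text": "# Title"}),
    from_stock_row({"ticker": "ABC", "price": "12.5"}),
]

# ...end up with exactly the same set of keys.
same_shape = all(set(n) == NODE_FIELDS for n in nodes)
print(same_shape)  # True
```

Downstream code only ever has to handle the one output shape, whichever source the data came from.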
🚀 Key Features
- Standard Schema (SayouNode): Defines the universal atom of the platform with node_id, node_class, attributes, and relationships.
- Dynamic Adapter Pattern: Uses a registry-based pipeline to automatically route inputs to the correct adapter (e.g., DocumentChunkAdapter).
- Built-in Validation: Leverages Pydantic to strictly validate data integrity upon creation.
- Semantic Mapping: Automatically converts sayou-chunking metadata (e.g., semantic_type) into Ontology classes (e.g., sayou:Table).
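To make the schema, validation, and semantic mapping features more concrete, here is a minimal sketch. It uses a standard-library dataclass as a stand-in for the real Pydantic model so that it runs without extra dependencies. The class name NodeSketch, the validation rules, the chunk layout, and every mapping entry other than sayou:Table are assumptions, not the library's actual definitions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# Hypothetical mapping table; only "sayou:Table" appears in this README.
SEMANTIC_TYPE_TO_CLASS = {
    "table": "sayou:Table",
    "text": "example:Text",
}
DEFAULT_CLASS = "example:Unknown"


@dataclass
class NodeSketch:
    """Stand-in for SayouNode with the four fields listed above."""

    node_id: str
    node_class: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    relationships: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Validate on creation, loosely imitating what a Pydantic model would enforce.
        if not isinstance(self.node_id, str) or not self.node_id:
            raise ValueError("node_id must be a non-empty string")
        if not isinstance(self.node_class, str) or ":" not in self.node_class:
            raise ValueError("node_class must look like 'prefix:Name'")


def chunk_to_node(chunk: Dict[str, Any]) -> NodeSketch:
    """Map a chunk's semantic_type metadata to an ontology class (assumed layout)."""
    meta = chunk.get("metadata", {})
    node_class = SEMANTIC_TYPE_TO_CLASS.get(meta.get("semantic_type"), DEFAULT_CLASS)
    return NodeSketch(
        node_id=str(chunk["id"]),
        node_class=node_class,
        attributes={"text": chunk.get("text", "")},
    )


table_node = chunk_to_node(
    {"id": "c1", "text": "| a | b |", "metadata": {"semantic_type": "table"}}
)
print(table_node.node_class)  # sayou:Table
```

Constructing NodeSketch with an empty node_id raises ValueError immediately, which is the "validate upon creation" behavior in miniature.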
📦 Installation
pip install sayou-wrapper
⚡ Quickstart
The WrapperPipeline orchestrates the adaptation process. You don't need to import specific adapters; just specify the adapter_type.
from sayou.wrapper.pipeline import WrapperPipeline


def run_demo():
    # 1. Load Input (Output from sayou-chunking)
    input_file = "chunks_output.json"

    # 2. Initialize Pipeline
    # Use 'document_chunk' adapter to process chunking results
    pipeline = WrapperPipeline(adapter_type="document_chunk")
    pipeline.initialize()

    # 3. Run Transformation
    # The pipeline automatically loads the JSON file
    result = pipeline.run(input_file)

    # 4. Check Standard Nodes (print the first two)
    for node in result['nodes'][:2]:
        print(f"[ID] {node['node_id']}")
        print(f"[Class] {node['node_class']}")
        print(f"[Attrs] {node['attributes']}")
        print("-" * 20)


if __name__ == "__main__":
    run_demo()
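Once a result is available, a quick sanity check is to count nodes per class. The helper below is a small sketch that relies only on the result shape used in the Quickstart (a dict with a 'nodes' list whose items carry node_class). The function name and the sample data are hand-written for illustration and are not real pipeline output.

```python
from collections import Counter
from typing import Any, Dict


def count_node_classes(result: Dict[str, Any]) -> Counter:
    """Count how many nodes of each node_class a pipeline result contains."""
    return Counter(node["node_class"] for node in result.get("nodes", []))


# Hand-written sample in the same shape as the Quickstart result (not real output).
sample_result = {
    "nodes": [
        {"node_id": "n1", "node_class": "sayou:Table", "attributes": {}},
        {"node_id": "n2", "node_class": "example:Text", "attributes": {}},
        {"node_id": "n3", "node_class": "sayou:Table", "attributes": {}},
    ]
}

class_counts = count_node_classes(sample_result)
print(class_counts["sayou:Table"])  # 2
```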
🤝 Contributing
To support a new data source (e.g., Notion API), simply implement a new BaseAdapter plugin and register it with the pipeline.
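The sketch below shows one way a registry-based adapter plugin could look. It is self-contained and does not use the library's real classes: BaseAdapterSketch, the register decorator, the adapt method, the notion_page key, and the example:Page class are all hypothetical names, and the actual BaseAdapter interface and registration mechanism may differ.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List, Type


class BaseAdapterSketch(ABC):
    """Hypothetical stand-in for the library's BaseAdapter."""

    @abstractmethod
    def adapt(self, raw: Any) -> List[Dict[str, Any]]:
        """Convert raw source data into a list of node dicts."""


# Registry mapping an adapter_type string to an adapter class.
ADAPTERS: Dict[str, Type[BaseAdapterSketch]] = {}


def register(adapter_type: str) -> Callable[[Type[BaseAdapterSketch]], Type[BaseAdapterSketch]]:
    """Class decorator that records an adapter under the given key."""

    def decorator(cls: Type[BaseAdapterSketch]) -> Type[BaseAdapterSketch]:
        ADAPTERS[adapter_type] = cls
        return cls

    return decorator


@register("notion_page")
class NotionPageAdapter(BaseAdapterSketch):
    def adapt(self, raw: Any) -> List[Dict[str, Any]]:
        # Assumed input: a list of {"id": ..., "title": ...} page records.
        return [
            {
                "node_id": f"notion:{page['id']}",
                "node_class": "example:Page",
                "attributes": {"title": page["title"]},
                "relationships": {},
            }
            for page in raw
        ]


def run(adapter_type: str, raw: Any) -> List[Dict[str, Any]]:
    """Look up the adapter by key and apply it, failing loudly on unknown keys."""
    try:
        adapter_cls = ADAPTERS[adapter_type]
    except KeyError:
        raise ValueError(f"unknown adapter_type: {adapter_type!r}") from None
    return adapter_cls().adapt(raw)


notion_nodes = run("notion_page", [{"id": "p1", "title": "Roadmap"}])
print(notion_nodes[0]["node_id"])  # notion:p1
```

Routing by a string key is what lets callers pick an adapter without importing it, as the Quickstart does with adapter_type="document_chunk".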
📜 License
Apache 2.0 License © 2025 Sayouzone
File details
Details for the file sayou_wrapper-0.1.1.tar.gz.
File metadata
- Download URL: sayou_wrapper-0.1.1.tar.gz
- Upload date:
- Size: 10.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 31548a7cd53e62c67b37b87612a7f4e7b6c5390c2e337cfc9a13754afba586db |
| MD5 | ee68e683de7dc276ab2f8e9557f60c5e |
| BLAKE2b-256 | 1c1a52c2153bc28e8769acb0431678db7ad4bba6cea405292a36d0114ab90a1b |
File details
Details for the file sayou_wrapper-0.1.1-py3-none-any.whl.
File metadata
- Download URL: sayou_wrapper-0.1.1-py3-none-any.whl
- Upload date:
- Size: 10.7 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 1849e2ab0b76d4894578b0bd4cc2a10918e66955c954845b1e2ffa8d010598d2 |
| MD5 | ea46aa311166a41db32ee6174d242d15 |
| BLAKE2b-256 | 24d1bb34b53dcf7fd6d9b595a5422de346384e6caf8f74d38a469e67fb2e394c |