Wrapper components for the Sayou Data Platform

sayou-wrapper

sayou-wrapper is a schema normalization library that acts as the "Gatekeeper" of the Sayou Data Platform. It transforms data from heterogeneous sources (document chunks, API responses, CSV files) into the unified Sayou Standard Schema.

This library ensures that downstream components like sayou-assembler do not need to understand the complexity of external data formats. If data passes through the Wrapper, it is guaranteed to be a valid SayouNode.

Philosophy

"Polymorphic Input, Monomorphic Output." Whether the input is a Markdown chunk, a JSON response from a subway API, or a list of stock prices, the output must always be a list of standardized SayouNode objects. This abstraction lets the Knowledge Graph builder remain agnostic to the data source.
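
To make the idea concrete, here is a minimal, stdlib-only sketch. It does not use sayou-wrapper itself: the function names, the input field names, and the class names `sayou:Text` and `sayou:StockPrice` are all assumptions made for illustration. Only the four node fields come from this README.

```python
from typing import Any

# Hypothetical node shape: a plain dict with the four fields the README names.
Node = dict[str, Any]


def from_markdown_chunk(chunk: dict[str, Any]) -> list[Node]:
    """Normalize one Markdown chunk (input field names are assumed)."""
    return [{
        "node_id": f"chunk:{chunk['id']}",
        "node_class": "sayou:Text",  # assumed class name
        "attributes": {"text": chunk["text"]},
        "relationships": {},
    }]


def from_stock_prices(rows: list[dict[str, Any]]) -> list[Node]:
    """Normalize a list of stock prices (input field names are assumed)."""
    return [{
        "node_id": f"stock:{row['ticker']}:{row['date']}",
        "node_class": "sayou:StockPrice",  # assumed class name
        "attributes": {"close": row["close"]},
        "relationships": {},
    } for row in rows]


def count_by_class(nodes: list[Node]) -> dict[str, int]:
    """A downstream consumer that only ever sees the common node shape."""
    counts: dict[str, int] = {}
    for node in nodes:
        counts[node["node_class"]] = counts.get(node["node_class"], 0) + 1
    return counts


# Two very different inputs end up in one homogeneous list.
nodes = from_markdown_chunk({"id": 1, "text": "# Title"})
nodes += from_stock_prices([
    {"ticker": "ABC", "date": "2025-01-02", "close": 10.5},
    {"ticker": "ABC", "date": "2025-01-03", "close": 11.0},
])
print(count_by_class(nodes))
```

Note that `count_by_class` never inspects where a node came from, which is the property the downstream components rely on.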

🚀 Key Features

  • Standard Schema (SayouNode): Defines the universal atom of the platform with node_id, node_class, attributes, and relationships.
  • Dynamic Adapter Pattern: Uses a registry-based pipeline to automatically route inputs to the correct adapter (e.g., DocumentChunkAdapter).
  • Built-in Validation: Leverages Pydantic to strictly validate data integrity upon creation.
  • Semantic Mapping: Automatically converts sayou-chunking metadata (e.g., semantic_type) into Ontology classes (e.g., sayou:Table).
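
The validation and semantic-mapping features can be pictured with the following sketch. It is not the library's code: the library uses Pydantic, while this stand-in uses a stdlib dataclass so it runs without extra dependencies. `NodeSketch`, `chunk_to_node`, the chunk field names, and every mapping entry except `sayou:Table` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any

# Assumed mapping from a chunk's semantic_type to an ontology class.
# Only "sayou:Table" appears in the README; the rest is hypothetical.
SEMANTIC_TYPE_TO_CLASS = {
    "table": "sayou:Table",
    "heading": "sayou:Heading",
}
DEFAULT_CLASS = "sayou:Text"  # hypothetical fallback class


@dataclass
class NodeSketch:
    """Stand-in for SayouNode that checks its fields on creation."""
    node_id: str
    node_class: str
    attributes: dict[str, Any] = field(default_factory=dict)
    relationships: dict[str, list[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not isinstance(self.node_id, str) or not self.node_id:
            raise ValueError("node_id must be a non-empty string")
        if not isinstance(self.node_class, str) or not self.node_class.startswith("sayou:"):
            raise ValueError(f"unexpected node_class: {self.node_class!r}")


def chunk_to_node(chunk: dict[str, Any]) -> NodeSketch:
    """Map a chunk's semantic_type to an ontology class and build a node."""
    semantic_type = chunk.get("metadata", {}).get("semantic_type", "")
    node_class = SEMANTIC_TYPE_TO_CLASS.get(semantic_type, DEFAULT_CLASS)
    return NodeSketch(
        node_id=str(chunk["chunk_id"]),
        node_class=node_class,
        attributes={"text": chunk.get("content", "")},
    )


node = chunk_to_node({
    "chunk_id": "doc1_0",
    "content": "| a | b |",
    "metadata": {"semantic_type": "table"},
})
print(node.node_class)

# Invalid data is rejected when the node is constructed, not later.
try:
    NodeSketch(node_id="", node_class="sayou:Table")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Failing at construction time means a malformed node can never reach downstream components, which is the guarantee the Wrapper is meant to provide.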

📦 Installation

pip install sayou-wrapper

⚡ Quickstart

The WrapperPipeline orchestrates the adaptation process. You don't need to import specific adapters; just specify the adapter_type.

from sayou.wrapper.pipeline import WrapperPipeline

def run_demo():
    # 1. Point to the input (output from sayou-chunking)
    input_file = "chunks_output.json"
    
    # 2. Initialize Pipeline
    # Use 'document_chunk' adapter to process chunking results
    pipeline = WrapperPipeline(adapter_type="document_chunk")
    pipeline.initialize()

    # 3. Run Transformation
    # The pipeline automatically loads the JSON file
    result = pipeline.run(input_file)
    
    # 4. Check Standard Nodes
    for node in result['nodes'][:2]:
        print(f"[ID] {node['node_id']}")
        print(f"[Class] {node['node_class']}")
        print(f"[Attrs] {node['attributes']}")
        print("-" * 20)

if __name__ == "__main__":
    run_demo()

🤝 Contributing

To support a new data source (e.g., Notion API), simply implement a new BaseAdapter plugin and register it with the pipeline.
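
As a rough picture of what such a plugin might look like, the sketch below builds a tiny registry from scratch. It does not import sayou-wrapper, and it should not be read as the library's actual interface: `AdapterSketch`, `register_adapter`, `NotionPageAdapter`, the `adapt` method, and the input fields are all hypothetical. Check the real BaseAdapter class for the method names it expects.

```python
from abc import ABC, abstractmethod
from typing import Any


class AdapterSketch(ABC):
    """Stand-in for BaseAdapter: turns raw input into node dicts."""

    @abstractmethod
    def adapt(self, raw: Any) -> list[dict[str, Any]]:
        ...


# Hypothetical registry mapping adapter_type strings to adapter classes.
ADAPTER_REGISTRY: dict[str, type[AdapterSketch]] = {}


def register_adapter(name: str):
    """Class decorator that records an adapter under a lookup name."""
    def decorator(cls: type[AdapterSketch]) -> type[AdapterSketch]:
        ADAPTER_REGISTRY[name] = cls
        return cls
    return decorator


@register_adapter("notion_page")
class NotionPageAdapter(AdapterSketch):
    """Hypothetical adapter; the input fields are made up for illustration."""

    def adapt(self, raw: Any) -> list[dict[str, Any]]:
        return [{
            "node_id": f"notion:{page['id']}",
            "node_class": "sayou:Document",  # assumed class name
            "attributes": {"title": page["title"]},
            "relationships": {},
        } for page in raw]


def run(adapter_type: str, raw: Any) -> dict[str, Any]:
    """Look up the adapter by name and return a dict with a 'nodes' key."""
    try:
        adapter_cls = ADAPTER_REGISTRY[adapter_type]
    except KeyError:
        raise ValueError(f"unknown adapter_type: {adapter_type!r}") from None
    return {"nodes": adapter_cls().adapt(raw)}


result = run("notion_page", [{"id": "abc123", "title": "Roadmap"}])
print(result["nodes"][0]["node_id"])
```

Keying the registry by a string is what would let callers select an adapter with `adapter_type` alone, without importing the adapter class.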

📜 License

Apache 2.0 License © 2025 Sayouzone
