Wrapper components for the Sayou Data Platform

sayou-wrapper

The Ontology Mapper for Sayou Fabric.

sayou-wrapper takes the fragmented chunks produced by sayou-chunking and wraps them into a standardized graph structure (SayouNode). This is the final preparation step before data is assembled into a Knowledge Graph or loaded into a Vector DB.

It applies the Sayou Ontology Schema (Namespace -> Class -> Predicate) to raw data, turning simple text into semantically rich entities.


1. Architecture & Role

The Wrapper sits between Chunking and Assembly. It transforms raw dictionaries (Chunks) into standardized Objects (Nodes) with globally unique URIs.

graph LR
    Chunks[Raw Chunks] --> Pipeline[Wrapper Pipeline]
    
    subgraph Logic
        Schema[Schema Mapping]
        URI[URI Generation]
        Attr[Attribute Norm]
    end
    
    Pipeline --> Logic
    Logic --> Nodes[SayouNodes]

1.1. Core Features

  • Ontology Enforcement: Assigns strict classes (e.g., sayou:Topic, sayou:Function) based on chunk metadata.
  • URI Normalization: Generates idempotent IDs (e.g., sayou:doc:hash_123) to prevent duplication.
  • Metadata Preservation: Carries over source coordinates (page numbers, line numbers) into node attributes.
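As a rough illustration of why hash-based IDs are idempotent, the sketch below derives an ID from the chunk content alone, so wrapping the same chunk twice yields the same URI. The helper name `make_node_id`, the `doc` segment, and the 12-character SHA-256 prefix are assumptions made for this example, not the library's actual scheme.

```python
import hashlib


def make_node_id(content: str, kind: str = "doc", namespace: str = "sayou:") -> str:
    """Hypothetical helper: build a deterministic ID from the chunk content."""
    # The same input text always produces the same digest, hence the same ID.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    return f"{namespace}{kind}:{digest}"


first = make_node_id("Sayou is great.")
second = make_node_id("Sayou is great.")
other = make_node_id("Something else.")

print(first == second)  # identical content maps to one ID
print(first == other)   # different content maps to a different ID
```

Because the ID depends only on the input, re-running a pipeline over unchanged data would overwrite existing nodes rather than create duplicates.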

2. Available Strategies

sayou-wrapper applies different mapping rules based on the source domain.

| Strategy Key | Target Domain | Description |
|---|---|---|
| document_chunk | PDF, Markdown | Maps headers to sayou:Topic and text to sayou:TextFragment. Preserves hierarchy metadata. |
| code_chunk | Python, Java | Maps AST objects to sayou:Class, sayou:Function, or sayou:Module. |
| video_chunk | YouTube, MP4 | Maps timestamps to sayou:VideoSegment. Preserves start/end times. |
| general | Plain Text | [Default] Treats everything as generic sayou:Unstructured nodes. |
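The mapping rules in the table above can be pictured as a two-level lookup: strategy key first, then the chunk's type. The sketch below is only a mental model. The dictionary layout, the `resolve_class` helper, and the `segment` type key are assumptions for illustration and do not reflect how sayou-wrapper is implemented.

```python
# Hypothetical lookup tables derived from the strategy table; not library internals.
CLASS_MAP = {
    "document_chunk": {"heading": "sayou:Topic", "text": "sayou:TextFragment"},
    "code_chunk": {
        "class": "sayou:Class",
        "function": "sayou:Function",
        "module": "sayou:Module",
    },
    "video_chunk": {"segment": "sayou:VideoSegment"},  # "segment" is an assumed key
}
FALLBACK_CLASS = "sayou:Unstructured"  # what the "general" strategy assigns


def resolve_class(strategy: str, chunk_type: str) -> str:
    """Return the ontology class for a chunk, falling back to the generic class."""
    return CLASS_MAP.get(strategy, {}).get(chunk_type, FALLBACK_CLASS)


print(resolve_class("document_chunk", "heading"))
print(resolve_class("code_chunk", "function"))
print(resolve_class("general", "text"))
```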

3. Installation

pip install sayou-wrapper

4. Usage

The WrapperPipeline converts a list of dictionaries (Chunks) into a WrapperOutput object containing nodes.

Case A: Document Processing (Default)

Converts hierarchical chunks into Topic and Fragment nodes.

from sayou.wrapper import WrapperPipeline

chunks = [
    {
        "content": "# Introduction",
        "metadata": {
            "chunk_id": "h_1", 
            "semantic_type": "heading", 
            "depth": 1
        }
    },
    {
        "content": "Sayou is great.",
        "metadata": {
            "chunk_id": "p_1", 
            "parent_id": "h_1",
            "semantic_type": "text"
        }
    }
]

result = WrapperPipeline.process(
    data=chunks,
    strategy="document_chunk"
)

for node in result.nodes:
    print(f"[{node.node_class}] {node.node_id}")
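
The `parent_id` field in the second chunk is what ties the text fragment to its heading. As a library-independent sketch of that relationship, the snippet below groups chunk IDs under their parents using only the input dictionaries. It is an illustration of the hierarchy metadata, not a description of what `WrapperPipeline` does internally.

```python
from collections import defaultdict

chunks = [
    {
        "content": "# Introduction",
        "metadata": {"chunk_id": "h_1", "semantic_type": "heading", "depth": 1},
    },
    {
        "content": "Sayou is great.",
        "metadata": {"chunk_id": "p_1", "parent_id": "h_1", "semantic_type": "text"},
    },
]

# Map each parent chunk ID to the IDs of the chunks that point at it.
children = defaultdict(list)
for chunk in chunks:
    parent = chunk["metadata"].get("parent_id")
    if parent is not None:
        children[parent].append(chunk["metadata"]["chunk_id"])

print(dict(children))
```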

Case B: Code Processing

Converts AST-based chunks into structural code nodes.

from sayou.wrapper import WrapperPipeline

code_chunks = [
    {
        "content": "def my_func(): pass",
        "metadata": {"type": "function", "name": "my_func"}
    }
]

result = WrapperPipeline.process(
    data=code_chunks,
    strategy="code_chunk"
)

print(f"Node Class: {result.nodes[0].node_class}")

5. Configuration Keys

The config dictionary controls how node IDs are generated and how strictly the schema is enforced.

  • namespace: The prefix for URIs (default: sayou:).
  • id_strategy: hash (deterministic) or uuid (random).
  • strict_mode: If True, raises an error for unknown semantic types.
  • attribute_filter: List of metadata keys to exclude from the final Node attributes.
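
To make the four keys concrete, here is a minimal, library-independent sketch of how such a config could influence node construction. The `build_node` function, the `KNOWN_TYPES` set, the plain-dict node shape, and the choice of `ValueError` are all assumptions for illustration. How the config is actually passed to `WrapperPipeline` is not shown in this README and is not assumed here.

```python
import hashlib
import uuid

config = {
    "namespace": "sayou:",
    "id_strategy": "hash",          # "hash" (deterministic) or "uuid" (random)
    "strict_mode": True,
    "attribute_filter": ["depth"],  # metadata keys to drop from node attributes
}

KNOWN_TYPES = {"heading", "text"}  # assumed set of recognised semantic types


def build_node(chunk: dict, config: dict) -> dict:
    """Hypothetical stand-in showing the effect of each config key."""
    meta = chunk["metadata"]
    semantic_type = meta.get("semantic_type")
    # strict_mode: reject chunks whose semantic type is not recognised.
    if config["strict_mode"] and semantic_type not in KNOWN_TYPES:
        raise ValueError(f"unknown semantic type: {semantic_type!r}")
    # id_strategy: content hash for repeatable IDs, otherwise a random UUID.
    if config["id_strategy"] == "hash":
        suffix = hashlib.sha256(chunk["content"].encode("utf-8")).hexdigest()[:12]
    else:
        suffix = uuid.uuid4().hex[:12]
    # attribute_filter: exclude the listed metadata keys from the attributes.
    attributes = {k: v for k, v in meta.items() if k not in config["attribute_filter"]}
    return {"node_id": f"{config['namespace']}{suffix}", "attributes": attributes}


node = build_node(
    {"content": "# Introduction",
     "metadata": {"chunk_id": "h_1", "semantic_type": "heading", "depth": 1}},
    config,
)
print(node)
```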

6. License

Apache 2.0 License © 2026 Sayouzone
