Wrapper components for the Sayou Data Platform

sayou-wrapper

Overview

The Ontology Mapper for Sayou Fabric.

sayou-wrapper takes the fragmented chunks produced by sayou-chunking and wraps them into a standardized graph structure (SayouNode). This is the final preparation step before data is assembled into a Knowledge Graph or loaded into a Vector DB.

It applies the Sayou Ontology Schema (Namespace -> Class -> Predicate) to raw data, turning simple text into semantically rich entities.


1. Architecture & Role

The Wrapper sits between Chunking and Assembly. It transforms raw dictionaries (Chunks) into standardized Objects (Nodes) with globally unique URIs.

graph LR
    Chunks[Raw Chunks] --> Pipeline[Wrapper Pipeline]
    
    subgraph Logic
        Schema[Schema Mapping]
        URI[URI Generation]
        Attr[Attribute Norm]
    end
    
    Pipeline --> Logic
    Logic --> Nodes[SayouNodes]

1.1. Core Features

  • Ontology Enforcement: Assigns strict classes (e.g., sayou:Topic, sayou:Function) based on chunk metadata.
  • URI Normalization: Generates idempotent IDs (e.g., sayou:doc:hash_123) to prevent duplication.
  • Metadata Preservation: Carries over source coordinates (page numbers, line numbers) into node attributes.
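
The URI normalization idea above can be illustrated with a minimal sketch. It assumes the ID is derived from a SHA-256 digest of the chunk content; the helper name `make_node_uri`, the `kind` argument, and the 12-character truncation are hypothetical and are not taken from sayou-wrapper's actual implementation.

```python
import hashlib


def make_node_uri(content: str, kind: str = "doc", namespace: str = "sayou:") -> str:
    """Build a deterministic URI from chunk content (hypothetical helper)."""
    # Hashing the content means the same chunk always yields the same ID,
    # so re-running the pipeline does not create duplicate nodes.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    return f"{namespace}{kind}:{digest}"


uri_a = make_node_uri("Sayou is great.")
uri_b = make_node_uri("Sayou is great.")
uri_c = make_node_uri("Something else.")

print(uri_a == uri_b)  # same content, same URI
print(uri_a != uri_c)  # different content, different URI
```

A content hash is only one way to get idempotent IDs; a real scheme might also mix in the source path or chunk position so that identical text in two documents stays distinct.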

2. Available Strategies

sayou-wrapper applies different mapping rules based on the source domain.

| Strategy Key   | Target Domain | Description                                                                                  |
|----------------|---------------|----------------------------------------------------------------------------------------------|
| document_chunk | PDF, Markdown | Maps headers to sayou:Topic and text to sayou:TextFragment. Preserves hierarchy metadata.    |
| code_chunk     | Python, Java  | Maps AST objects to sayou:Class, sayou:Function, or sayou:Module.                            |
| video_chunk    | YouTube, MP4  | Maps timestamps to sayou:VideoSegment. Preserves start/end times.                            |
| general        | Plain Text    | [Default] Treats everything as generic sayou:Unstructured nodes.                             |
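
As a rough illustration of how per-strategy mapping rules like these could be organized, the sketch below uses a nested lookup table with a fallback class. The names `CLASS_MAP`, `FALLBACK_CLASS`, and `resolve_class` are hypothetical, and the chunk type keys `class`, `module`, and `segment` are assumptions; only the strategy keys and ontology classes come from the table.

```python
# Hypothetical lookup table: strategy key -> {chunk type -> ontology class}.
CLASS_MAP = {
    "document_chunk": {
        "heading": "sayou:Topic",
        "text": "sayou:TextFragment",
    },
    "code_chunk": {
        "class": "sayou:Class",        # assumed chunk type key
        "function": "sayou:Function",
        "module": "sayou:Module",      # assumed chunk type key
    },
    "video_chunk": {
        "segment": "sayou:VideoSegment",  # assumed chunk type key
    },
}

# The "general" strategy has no table entry, so it always hits the fallback.
FALLBACK_CLASS = "sayou:Unstructured"


def resolve_class(strategy: str, chunk_type: str) -> str:
    """Return the ontology class for a chunk type under a given strategy."""
    return CLASS_MAP.get(strategy, {}).get(chunk_type, FALLBACK_CLASS)


print(resolve_class("document_chunk", "heading"))
print(resolve_class("general", "anything"))
```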

3. Installation

pip install sayou-wrapper

4. Usage

The WrapperPipeline converts a list of dictionaries (Chunks) into a WrapperOutput object containing nodes.

Case A: Document Processing

Converts hierarchical chunks into Topic and Fragment nodes.

from sayou.wrapper import WrapperPipeline

chunks = [
    {
        "content": "# Introduction",
        "metadata": {
            "chunk_id": "h_1", 
            "semantic_type": "heading", 
            "depth": 1
        }
    },
    {
        "content": "Sayou is great.",
        "metadata": {
            "chunk_id": "p_1", 
            "parent_id": "h_1",
            "semantic_type": "text"
        }
    }
]

result = WrapperPipeline.process(
    data=chunks,
    strategy="document_chunk"
)

for node in result.nodes:
    print(f"[{node.node_class}] {node.node_id}")

Case B: Code Processing

Converts AST-based chunks into structural code nodes.

from sayou.wrapper import WrapperPipeline

code_chunks = [
    {
        "content": "def my_func(): pass",
        "metadata": {"type": "function", "name": "my_func"}
    }
]

result = WrapperPipeline.process(
    data=code_chunks,
    strategy="code_chunk"
)

print(f"Node Class: {result.nodes[0].node_class}")

5. Configuration Keys

The config dictionary controls how IDs are generated and how strictly the schema is enforced.

  • namespace: The prefix for URIs (default: sayou:).
  • id_strategy: hash (deterministic) or uuid (random).
  • strict_mode: If True, raises an error for unknown semantic types.
  • attribute_filter: List of metadata keys to exclude from the final Node attributes.
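
To make the effect of these keys concrete, here is a minimal standalone sketch that applies such a config to a single chunk. It does not call sayou-wrapper; `build_node`, `KNOWN_TYPES`, and the returned dictionary shape are hypothetical stand-ins, and how the real pipeline accepts its config is not shown here.

```python
import hashlib
import uuid

# Keys taken from the list above; the values are example choices.
config = {
    "namespace": "sayou:",
    "id_strategy": "hash",          # "hash" (deterministic) or "uuid" (random)
    "strict_mode": True,
    "attribute_filter": ["depth"],  # metadata keys to drop from node attributes
}

KNOWN_TYPES = {"heading", "text"}   # assumed set, for illustration only


def build_node(chunk: dict, config: dict) -> dict:
    """Apply the config keys to one chunk (hypothetical helper)."""
    meta = chunk["metadata"]
    semantic_type = meta.get("semantic_type")
    # strict_mode: reject semantic types the schema does not know.
    if config["strict_mode"] and semantic_type not in KNOWN_TYPES:
        raise ValueError(f"Unknown semantic type: {semantic_type!r}")
    # id_strategy: a content hash is repeatable, a UUID is not.
    if config["id_strategy"] == "hash":
        suffix = hashlib.sha256(chunk["content"].encode("utf-8")).hexdigest()[:12]
    else:
        suffix = uuid.uuid4().hex[:12]
    # attribute_filter: exclude the listed metadata keys.
    attributes = {k: v for k, v in meta.items() if k not in config["attribute_filter"]}
    return {"node_id": f"{config['namespace']}{suffix}", "attributes": attributes}


node = build_node(
    {
        "content": "# Introduction",
        "metadata": {"chunk_id": "h_1", "semantic_type": "heading", "depth": 1},
    },
    config,
)
print(node["attributes"])
```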

6. License

Apache 2.0 License © 2026 Sayouzone

7. Plugin List

  • Code Chunk Wrapper
  • Document Chunk Wrapper
  • Video Chunk Wrapper
  • Embedding Wrapper
  • Metadata Wrapper
