Skip to main content

Wrapper components for the Sayou Data Platform

Project description

sayou-wrapper

PyPI version License Docs

Overview

The Ontology Mapper for Sayou Fabric.

sayou-wrapper takes the fragmented chunks produced by sayou-chunking and wraps them into a standardized graph structure (SayouNode). This is the final preparation step before data is assembled into a Knowledge Graph or loaded into a Vector DB.

It applies the Sayou Ontology Schema (Namespace -> Class -> Predicate) to raw data, turning simple text into semantically rich entities.


1. Architecture & Role

The Wrapper sits between Chunking and Assembly. It transforms raw dictionaries (Chunks) into standardized Objects (Nodes) with globally unique URIs.

graph LR
    Chunks[Raw Chunks] --> Pipeline[Wrapper Pipeline]
    
    subgraph Logic
        Schema[Schema Mapping]
        URI[URI Generation]
        Attr[Attribute Norm]
    end
    
    Pipeline --> Logic
    Logic --> Nodes[SayouNodes]

1.1. Core Features

  • Ontology Enforcement: Assigns strict classes (e.g., sayou:Topic, sayou:Function) based on chunk metadata.
  • URI Normalization: Generates idempotent IDs (e.g., sayou:doc:hash_123) to prevent duplication.
  • Metadata Preservation: Carries over source coordinates (page numbers, line numbers) into node attributes.

2. Available Strategies

sayou-wrapper applies different mapping rules based on the source domain.

Strategy Key Target Domain Description
document_chunk PDF, Markdown Maps headers to sayou:Topic and text to sayou:TextFragment. Preserves hierarchy metadata.
code_chunk Python, Java Maps AST objects to sayou:Class, sayou:Function, or sayou:Module.
video_chunk YouTube, MP4 Maps timestamps to sayou:VideoSegment. Preserves start/end times.
general Plain Text [Default] Treats everything as generic sayou:Unstructured nodes.

3. Installation

pip install sayou-wrapper

4. Usage

The WrapperPipeline converts a list of dictionaries (Chunks) into a WrapperOutput object containing nodes.

Case A: Document Processing (Default)

Converts hierarchical chunks into Topic and Fragment nodes.

from sayou.wrapper import WrapperPipeline

chunks = [
    {
        "content": "# Introduction",
        "metadata": {
            "chunk_id": "h_1", 
            "semantic_type": "heading", 
            "depth": 1
        }
    },
    {
        "content": "Sayou is great.",
        "metadata": {
            "chunk_id": "p_1", 
            "parent_id": "h_1",
            "semantic_type": "text"
        }
    }
]

result = WrapperPipeline.process(
    data=chunks,
    strategy="document_chunk"
)

for node in result.nodes:
    print(f"[{node.node_class}] {node.node_id}")

Case B: Code Processing

Converts AST-based chunks into structural code nodes.

from sayou.wrapper import WrapperPipeline

code_chunks = [
    {
        "content": "def my_func(): pass",
        "metadata": {"type": "function", "name": "my_func"}
    }
]

result = WrapperPipeline.process(
    data=code_chunks,
    strategy="code_chunk"
)

print(f"Node Class: {result.nodes[0].node_class}")

5. Configuration Keys

The config dictionary controls how IDs are generated and which schema strictness to apply.

  • namespace: The prefix for URIs (default: sayou:).
  • id_strategy: hash (deterministic) or uuid (random).
  • strict_mode: If True, raises error for unknown semantic types.
  • attribute_filter: List of metadata keys to exclude from the final Node attributes.

6. License

Apache 2.0 License © 2026 Sayouzone

7. Plugin List

Plugin Example Description
Code Chunk Wrapper
Document Chunk Wrapper
Video Chunk Wrapper
Embedding Wrapper
Metatdata Wrapper

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

sayou_wrapper-0.4.6.tar.gz (22.0 kB view details)

Uploaded Source

Built Distribution

If you're not sure about the file name format, learn more about wheel file names.

sayou_wrapper-0.4.6-py3-none-any.whl (21.2 kB view details)

Uploaded Python 3

File details

Details for the file sayou_wrapper-0.4.6.tar.gz.

File metadata

  • Download URL: sayou_wrapper-0.4.6.tar.gz
  • Upload date:
  • Size: 22.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sayou_wrapper-0.4.6.tar.gz
Algorithm Hash digest
SHA256 641445db4a74786810800d2495555fc1904642e9ad154c77a4ee3089845df3cb
MD5 fd716bb0f775772e29f3d391648442d4
BLAKE2b-256 01f7e31b2678f167e364533df21022909f67f509f69b8335c6e7447cb90ff108

See more details on using hashes here.

File details

Details for the file sayou_wrapper-0.4.6-py3-none-any.whl.

File metadata

  • Download URL: sayou_wrapper-0.4.6-py3-none-any.whl
  • Upload date:
  • Size: 21.2 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/6.1.0 CPython/3.13.7

File hashes

Hashes for sayou_wrapper-0.4.6-py3-none-any.whl
Algorithm Hash digest
SHA256 08de2c5c727f893212877fd006ade74829dde1a6abed2520630d4b1ebcd91475
MD5 0163864998efc936cab96e22c163da54
BLAKE2b-256 818dbfbc1619496332e954be2e32f589c0f084b90ccb8bcfeb9c90533a63dc00

See more details on using hashes here.

Supported by

AWS Cloud computing and Security Sponsor Datadog Monitoring Depot Continuous Integration Fastly CDN Google Download Analytics Pingdom Monitoring Sentry Error logging StatusPage Status page