Wrapper components for the Sayou Data Platform

sayou-wrapper

Overview

The Ontology Mapper for Sayou Fabric.

sayou-wrapper takes the fragmented chunks produced by sayou-chunking and wraps them into a standardized graph structure (SayouNode). This is the final preparation step before data is assembled into a Knowledge Graph or loaded into a Vector DB.

It applies the Sayou Ontology Schema (Namespace -> Class -> Predicate) to raw data, turning simple text into semantically rich entities.


1. Architecture & Role

The Wrapper sits between Chunking and Assembly. It transforms raw dictionaries (Chunks) into standardized Objects (Nodes) with globally unique URIs.

graph LR
    Chunks[Raw Chunks] --> Pipeline[Wrapper Pipeline]
    
    subgraph Logic
        Schema[Schema Mapping]
        URI[URI Generation]
        Attr[Attribute Norm]
    end
    
    Pipeline --> Logic
    Logic --> Nodes[SayouNodes]

1.1. Core Features

  • Ontology Enforcement: Assigns strict classes (e.g., sayou:Topic, sayou:Function) based on chunk metadata.
  • URI Normalization: Generates idempotent IDs (e.g., sayou:doc:hash_123) to prevent duplication.
  • Metadata Preservation: Carries over source coordinates (page numbers, line numbers) into node attributes.
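The idea behind idempotent IDs can be sketched as follows. This is an illustration only, not the library's implementation: `make_node_id`, its `kind` parameter, the choice of SHA-256, and the 12-character digest length are all assumptions made for the example. The point is that hashing the same content always yields the same URI, so re-processing a chunk does not create a duplicate node.

```python
import hashlib


def make_node_id(content: str, namespace: str = "sayou:", kind: str = "doc") -> str:
    """Hypothetical helper: derive a deterministic URI from chunk content."""
    # Hash the content so that identical input always maps to the same ID.
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()[:12]
    return f"{namespace}{kind}:{digest}"


first = make_node_id("Sayou is great.")
second = make_node_id("Sayou is great.")
print(first == second)  # True
```

A random scheme such as UUID4 would give a different ID on every run, which is why a hash-based scheme is the natural fit when deduplication matters.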

2. Available Strategies

sayou-wrapper applies different mapping rules based on the source domain.

| Strategy Key | Target Domain | Description |
| --- | --- | --- |
| document_chunk | PDF, Markdown | Maps headers to sayou:Topic and text to sayou:TextFragment. Preserves hierarchy metadata. |
| code_chunk | Python, Java | Maps AST objects to sayou:Class, sayou:Function, or sayou:Module. |
| video_chunk | YouTube, MP4 | Maps timestamps to sayou:VideoSegment. Preserves start/end times. |
| general | Plain Text | [Default] Treats everything as generic sayou:Unstructured nodes. |
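One way to picture these mapping rules is as a per-strategy lookup table. The sketch below is hypothetical and does not reflect the package's internals: `CLASS_MAP`, `FALLBACK`, and `resolve_class` are made-up names, and the metadata values `"class"`, `"module"`, and `"segment"` are assumptions (only `"heading"`, `"text"`, and `"function"` appear in the usage examples). The ontology class names are taken from the strategy descriptions.

```python
# Hypothetical lookup tables; not the library's actual data structures.
CLASS_MAP = {
    "document_chunk": {
        "heading": "sayou:Topic",
        "text": "sayou:TextFragment",
    },
    "code_chunk": {
        "class": "sayou:Class",        # assumed metadata value
        "function": "sayou:Function",
        "module": "sayou:Module",      # assumed metadata value
    },
    "video_chunk": {
        "segment": "sayou:VideoSegment",  # assumed metadata value
    },
}
FALLBACK = "sayou:Unstructured"  # what the "general" strategy produces


def resolve_class(strategy: str, metadata: dict) -> str:
    """Pick an ontology class from chunk metadata, falling back to Unstructured."""
    # Document chunks use "semantic_type"; code chunks use "type" in the examples.
    key = metadata.get("semantic_type") or metadata.get("type")
    return CLASS_MAP.get(strategy, {}).get(key, FALLBACK)


print(resolve_class("document_chunk", {"semantic_type": "heading"}))
print(resolve_class("code_chunk", {"type": "function"}))
print(resolve_class("general", {"semantic_type": "text"}))
```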

3. Installation

pip install sayou-wrapper

4. Usage

The WrapperPipeline converts a list of dictionaries (Chunks) into a WrapperOutput object containing nodes.

Case A: Document Processing (Default)

Converts hierarchical chunks into Topic and Fragment nodes.

from sayou.wrapper import WrapperPipeline

chunks = [
    {
        "content": "# Introduction",
        "metadata": {
            "chunk_id": "h_1", 
            "semantic_type": "heading", 
            "depth": 1
        }
    },
    {
        "content": "Sayou is great.",
        "metadata": {
            "chunk_id": "p_1", 
            "parent_id": "h_1",
            "semantic_type": "text"
        }
    }
]

result = WrapperPipeline.process(
    data=chunks,
    strategy="document_chunk"
)

for node in result.nodes:
    print(f"[{node.node_class}] {node.node_id}")

Case B: Code Processing

Converts AST-based chunks into structural code nodes.

from sayou.wrapper import WrapperPipeline

code_chunks = [
    {
        "content": "def my_func(): pass",
        "metadata": {"type": "function", "name": "my_func"}
    }
]

result = WrapperPipeline.process(
    data=code_chunks,
    strategy="code_chunk"
)

print(f"Node Class: {result.nodes[0].node_class}")

5. Configuration Keys

The config dictionary controls how IDs are generated and how strictly the schema is enforced.

  • namespace: The prefix for URIs (default: sayou:).
  • id_strategy: hash (deterministic) or uuid (random).
  • strict_mode: If True, raises an error for unknown semantic types.
  • attribute_filter: List of metadata keys to exclude from the final Node attributes.
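A config dictionary using these keys might look like the one below. How the dictionary is handed to the pipeline is not shown here, so the sketch deliberately avoids calling the library. Instead, `build_attributes` and `KNOWN_TYPES` are hypothetical stand-ins that illustrate the described behaviour of strict_mode and attribute_filter; the exception type is also an assumption.

```python
config = {
    "namespace": "sayou:",
    "id_strategy": "hash",             # deterministic IDs; "uuid" would be random
    "strict_mode": True,
    "attribute_filter": ["parent_id"],
}

# Assumed set of recognised semantic types, for illustration only.
KNOWN_TYPES = {"heading", "text"}


def build_attributes(metadata: dict, config: dict) -> dict:
    """Hypothetical helper mimicking strict_mode and attribute_filter."""
    semantic_type = metadata.get("semantic_type")
    if config.get("strict_mode") and semantic_type not in KNOWN_TYPES:
        raise ValueError(f"Unknown semantic type: {semantic_type!r}")
    # Drop every metadata key listed in attribute_filter.
    excluded = set(config.get("attribute_filter", []))
    return {key: value for key, value in metadata.items() if key not in excluded}


attrs = build_attributes(
    {"chunk_id": "p_1", "parent_id": "h_1", "semantic_type": "text"},
    config,
)
print(attrs)  # {'chunk_id': 'p_1', 'semantic_type': 'text'}
```

With strict_mode set to False in this sketch, a chunk with an unrecognised semantic type would pass through instead of raising.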

6. License

Apache 2.0 License © 2026 Sayouzone

7. Plugin List

Plugin Example Description
Code Chunk Wrapper
Document Chunk Wrapper
Video Chunk Wrapper
Embedding Wrapper
Metadata Wrapper
