Connector components for the Sayou Data Platform
sayou-connector
The Universal Data Ingestion Engine for Sayou Fabric.
sayou-connector provides a unified interface for fetching data from files, cloud drives, databases, and SaaS APIs, and normalizes everything into a standard format called SayouPacket.
It decouples the logic of Navigation (Generator) from Retrieval (Fetcher), enabling complex recursive crawling, pagination, and API traversal strategies out of the box.
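To make the idea of a uniform output concrete, here is a minimal sketch of what such a packet could look like. The `uri` and `data` attributes appear in the usage examples later in this README; the class name `Packet`, the `meta` field, and the two helper functions are assumptions made for illustration and are not the actual SayouPacket definition.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Packet:
    """Hypothetical stand-in for SayouPacket; the real class may differ."""
    uri: str                                   # where the content came from
    data: Any                                  # payload: text, bytes, or a batch of rows
    meta: dict = field(default_factory=dict)   # assumed free-form metadata


def from_file(path: str, text: str) -> Packet:
    # Wrap file content in the common shape.
    return Packet(uri=f"file://{path}", data=text, meta={"source": "file"})


def from_rows(table: str, rows: list) -> Packet:
    # Wrap a batch of database rows in the same shape.
    return Packet(uri=f"db://{table}", data=rows,
                  meta={"source": "db", "count": len(rows)})


packets = [
    from_file("/docs/readme.md", "# Hello"),
    from_rows("orders", [{"id": 1}, {"id": 2}]),
]
for p in packets:
    print(p.uri, len(p.data))
```

Downstream code can then iterate over packets without caring which kind of source produced them.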
1. Architecture & Role
The Connector Pipeline manages the Feedback Loop between discovery and retrieval. It yields a stream of SayouPacket objects ready for the next stage (Refinery).
graph LR
Source[Source String] --> Pipeline[Connector Pipeline]
subgraph Generators [Navigation]
Dir[File Walker]
Crawler[Web Frontier]
APIPag[API Paginator]
end
subgraph Fetchers [Retrieval]
Local[File Read]
HTTP[Requests]
SQL[DB Query]
end
Pipeline --> Generators
Generators -->|Task| Fetchers
Fetchers -->|Packet| Pipeline
Pipeline -->|Feedback| Generators
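The loop in the diagram can be sketched in plain Python. This is a toy model under stated assumptions, not the library's implementation: `FrontierGenerator`, `DictFetcher`, `run`, and the in-memory `SITE` dictionary are all hypothetical names, and packets are plain dictionaries here.

```python
from collections import deque

# Toy "site": each page lists the links it contains.
SITE = {
    "/": ["/a", "/b"],
    "/a": ["/b", "/c"],
    "/b": [],
    "/c": ["/"],
}


class FrontierGenerator:
    """Navigation: decides where to go next (hypothetical interface)."""

    def __init__(self, seed):
        self.queue = deque([seed])
        self.seen = {seed}

    def next_task(self):
        return self.queue.popleft() if self.queue else None

    def feedback(self, packet):
        # Links discovered in a fetched packet flow back into the frontier.
        for link in packet["links"]:
            if link not in self.seen:
                self.seen.add(link)
                self.queue.append(link)


class DictFetcher:
    """Retrieval: knows how to get one task, nothing about ordering."""

    def fetch(self, task):
        return {"uri": task, "links": SITE[task]}


def run(seed):
    generator, fetcher = FrontierGenerator(seed), DictFetcher()
    while (task := generator.next_task()) is not None:
        packet = fetcher.fetch(task)
        generator.feedback(packet)   # the feedback edge in the diagram
        yield packet


visited = [p["uri"] for p in run("/")]
print(visited)
```

Because the generator only sees packets and the fetcher only sees tasks, either side can be swapped (for example, a paginator instead of a frontier) without touching the other.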
1.1. Core Features
- Generator/Fetcher Pattern: Separates "Where to go next" (Generator) from "How to get it" (Fetcher).
- Unified Packet: Whether the source is a Notion Page or a PostgreSQL Row, the output is always a uniform SayouPacket.
- Resilience: Built-in rate limiting, retries, and error handling for unstable network sources.
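As an illustration of the resilience point, the sketch below shows one common way to retry a flaky fetch with exponential backoff. It does not reflect how sayou-connector implements retries internally; `fetch_with_retry`, `flaky_fetch`, and every parameter name are assumptions.

```python
import time


def fetch_with_retry(fetch, task, retries=3, base_delay=0.01, sleep=time.sleep):
    """Call fetch(task), retrying on ConnectionError with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return fetch(task)
        except ConnectionError:
            if attempt == retries:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)


# A fake fetcher that fails twice, then succeeds.
calls = {"n": 0}


def flaky_fetch(task):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return f"payload for {task}"


delays = []  # record the waits instead of actually sleeping
result = fetch_with_retry(flaky_fetch, "page-1", sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the example fast and makes the backoff schedule easy to inspect.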
2. Supported Sources
sayou-connector ships source plugins in four categories, and the list keeps growing to cover more enterprise SaaS products and databases.
| Category | Key Sources | Description |
|---|---|---|
| Local / File | file, obsidian | Local file systems, Markdown vaults. |
| Web / Media | web, youtube, wikipedia, rss | Web crawling (Trafilatura), YouTube transcripts, Wiki articles. |
| SaaS / Cloud | github, notion, google_drive, gmail | Repository code, Notion workspaces, G-Suite documents. |
| Database | postgres, mysql, mongodb, oracle | SQL/NoSQL databases with pagination support. |
3. Installation
pip install sayou-connector
4. Usage
The ConnectorPipeline acts as the entry point. It automatically detects the source type or accepts a specific strategy.
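The README does not say how automatic detection works. One plausible approach is to inspect the source string, as in the sketch below. The function `guess_strategy` and its heuristics are hypothetical; only the strategy names are taken from the table in section 2.

```python
from urllib.parse import urlparse


def guess_strategy(source: str) -> str:
    """Return a strategy name for a source string (illustrative heuristics only)."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = parsed.netloc.lower()
        if host in ("github.com", "www.github.com"):
            return "github"
        if host.endswith("youtube.com") or host == "youtu.be":
            return "youtube"
        return "web"
    if parsed.scheme in ("postgres", "postgresql"):
        return "postgres"
    # No recognised scheme: treat the string as a local path.
    return "file"


for src in ("./my_docs", "https://news.daum.net/tech",
            "https://github.com/sayouzone/sayou-fabric"):
    print(src, "->", guess_strategy(src))
```

Passing `strategy` explicitly, as the cases below do, avoids any ambiguity such heuristics might introduce.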
Case A: Local & Web (Simple)
Fetching simple files or web pages.
from sayou.connector import ConnectorPipeline

# Walk a local directory
packets = ConnectorPipeline.process(
    source="./my_docs",
    strategy="file"
)

# Crawl a web page
web_packets = ConnectorPipeline.process(
    source="https://news.daum.net/tech",
    strategy="web"
)

for packet in web_packets:
    print(f"[Fetched] {packet.uri} ({len(packet.data)} bytes)")
Case B: SaaS Integration (GitHub / Notion)
Fetching structured data from external APIs.
from sayou.connector import ConnectorPipeline

# Collect the files of a GitHub repository
repo_packets = ConnectorPipeline.process(
    source="https://github.com/sayouzone/sayou-fabric",
    strategy="github"
)

print(f"Collected {len(list(repo_packets))} files from repo.")
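If the pipeline yields packets lazily, as the description of a packet stream in section 1 suggests, then wrapping the result in `list()` loads everything into memory and the stream cannot be iterated a second time. The sketch below demonstrates that behaviour with an ordinary Python generator; `fake_process` is a stand-in, not the real ConnectorPipeline.

```python
def fake_process():
    """Stand-in for a lazy packet stream; not the real ConnectorPipeline."""
    for name in ("README.md", "setup.py", "src/main.py"):
        yield {"uri": name, "data": b"x"}


stream = fake_process()
count = sum(1 for _ in stream)   # counts packets without building a list
leftover = list(stream)          # the generator is now exhausted
print(count, leftover)
```

For large repositories, counting or processing packets inside a single loop keeps memory use flat.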
Case C: Database Ingestion
Fetching rows from a database table.
from sayou.connector import ConnectorPipeline

db_config = {
    "host": "localhost",
    "user": "admin",
    "password": "password",
    "db": "sales_db"
}

# Fetch rows from the 'orders' table
db_packets = ConnectorPipeline.process(
    source="orders",
    strategy="postgres",
    config=db_config
)

# Each packet contains a batch of rows
for packet in db_packets:
    print(f"Batch rows: {len(packet.data)}")
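To show what "a batch of rows per packet" can mean in practice, here is a self-contained sketch of batched reading. It uses the standard-library `sqlite3` module as a stand-in for PostgreSQL so it runs anywhere; `iter_batches` and the table layout are assumptions, and the connector's own pagination may work differently.

```python
import sqlite3


def iter_batches(conn, table, batch_size=2):
    """Yield lists of rows from `table`, batch_size rows at a time."""
    # Table names cannot be bound as parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    cursor = conn.execute(f"SELECT id, item FROM {table} ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany("INSERT INTO orders (item) VALUES (?)",
                 [("pen",), ("ink",), ("pad",), ("cap",), ("nib",)])

batch_sizes = [len(batch) for batch in iter_batches(conn, "orders")]
print(batch_sizes)
```

Each yielded list would correspond to the `data` of one packet in the loop above.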
5. Configuration Keys
The config dictionary carries authentication and connection settings. It recognizes the following key groups:
- auth: API keys (e.g., github_token, notion_token, google_creds).
- db: Database credentials (host, port, user, password).
- crawl: Web crawling settings (user_agent, depth_limit, domain_lock).
- filter: File extensions to include/exclude (e.g., include=[".py", ".md"]).
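A combined configuration might look like the sketch below. The nesting is an assumption: Case C passes the database fields as a flat dictionary, so grouping them under `db` is only one possible reading of the key list. The value types for `depth_limit` and `domain_lock` and the environment variable names are also hypothetical.

```python
import os

# Hypothetical layout: the key names come from the list above, but whether
# they nest like this is an assumption. Check the package docs before use.
config = {
    "auth": {"github_token": os.environ.get("GITHUB_TOKEN", "")},
    "db": {
        "host": "localhost",
        "port": 5432,
        "user": "admin",
        "password": os.environ.get("DB_PASSWORD", ""),
    },
    "crawl": {"user_agent": "my-bot/1.0", "depth_limit": 2, "domain_lock": True},
    "filter": {"include": [".py", ".md"]},
}

# Quick sanity check that no key group was left out.
missing = {"auth", "db", "crawl", "filter"} - set(config)
print("missing groups:", sorted(missing))
```

Reading secrets from environment variables keeps tokens and passwords out of source control.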
6. License
Apache 2.0 License © 2026 Sayouzone
Download files
File details
Details for the file sayou_connector-0.4.1.tar.gz.
File metadata
- Download URL: sayou_connector-0.4.1.tar.gz
- Upload date:
- Size: 49.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | d53b379252346d3c189e1d173d077a981e89c0bafbbdcf3a4ffc16d10dcdaf4a |
| MD5 | b9efa172685fd0f44af4f05faa691df8 |
| BLAKE2b-256 | 01312f0ee063796e8605febe91e5d75d930d0cc710382944d63056af862d3b7d |
File details
Details for the file sayou_connector-0.4.1-py3-none-any.whl.
File metadata
- Download URL: sayou_connector-0.4.1-py3-none-any.whl
- Upload date:
- Size: 67.2 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | a6c656b63c84953d55a891db61a1e5b6f33d6f144f66ecdad34f81d153ff2223 |
| MD5 | 90a02bbf9807ed4cbad48aff7b614f13 |
| BLAKE2b-256 | d29ae202c5c46c692f0a643227df5dcacbcce0d7395715b3018fbaf959012ec7 |