Connector components for the Sayou Data Platform
sayou-connector
The Universal Data Ingestion Engine
sayou-connector provides a unified interface to fetch data from diverse sources—Local Files, Web URLs, and Databases—normalizing everything into a standard format called SayouPacket.
It separates the logic of Navigation (Generator) from Retrieval (Fetcher), enabling complex recursive crawling and pagination strategies out of the box.
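To make the split between Navigation and Retrieval concrete, here is a minimal sketch of a paginated database read using only the standard library. It does not use sayou-connector's own classes: `PageTask`, `fetch_page`, `next_task`, `crawl_table`, and the `articles` table are hypothetical names chosen for illustration, and the real Generator and Fetcher interfaces may look different.

```python
import sqlite3
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple

# NOTE: every name below is hypothetical and not part of sayou-connector.

Row = Tuple[int, str]


@dataclass(frozen=True)
class PageTask:
    """One unit of work: which slice of the table to read."""
    offset: int
    limit: int


def fetch_page(conn: sqlite3.Connection, task: PageTask) -> List[Row]:
    """Retrieval only: run exactly one query for one task."""
    cur = conn.execute(
        "SELECT id, title FROM articles ORDER BY id LIMIT ? OFFSET ?",
        (task.limit, task.offset),
    )
    return cur.fetchall()


def next_task(task: PageTask, rows: List[Row]) -> Optional[PageTask]:
    """Navigation only: decide what to fetch next from the last result."""
    if len(rows) < task.limit:
        return None  # a short page means the table is exhausted
    return PageTask(offset=task.offset + task.limit, limit=task.limit)


def crawl_table(conn: sqlite3.Connection, page_size: int = 2) -> Iterator[List[Row]]:
    """Alternate between fetching a page and asking for the next task."""
    task: Optional[PageTask] = PageTask(offset=0, limit=page_size)
    while task is not None:
        rows = fetch_page(conn, task)
        if rows:
            yield rows
        task = next_task(task, rows)


# Demo on an in-memory database with five rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO articles (title) VALUES (?)",
    [("a",), ("b",), ("c",), ("d",), ("e",)],
)
pages = list(crawl_table(conn, page_size=2))
print(pages)
```

Because `fetch_page` knows nothing about paging and `next_task` knows nothing about SQL, either side can be swapped out on its own, which is the property the Generator/Fetcher split is meant to provide.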
📦 Installation
pip install sayou-connector
⚡ Quick Start
The ConnectorPipeline manages the feedback loop between Generators and Fetchers.
from sayou.connector.pipeline import ConnectorPipeline


def run_demo():
    # 1. Initialize Pipeline
    pipeline = ConnectorPipeline()
    pipeline.initialize()

    # 2. Run (Example: Web Crawling)
    print("Starting Web Crawl...")

    # Returns an iterator of 'SayouPacket' objects
    packets = pipeline.run(
        source="https://news.daum.net/tech",
        strategy="web_crawl",
        link_pattern=r"https://v\.daum\.net/v/\d+",
        max_depth=1
    )

    # 3. Process Results (Stream)
    for packet in packets:
        if packet.success:
            print(f"[Fetched] {packet.task.uri}")
            # packet.data contains the extracted content (dict, bytes, etc.)
            print(f" Data: {str(packet.data)[:50]}...")
        else:
            print(f"[Error] {packet.error}")


if __name__ == "__main__":
    run_demo()
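The `link_pattern` argument is a regular expression. The sketch below shows which URLs that exact pattern accepts, using Python's standard `re` module. The candidate URLs are made up for illustration, and the use of `re.fullmatch` is an assumption: the README does not say whether the library matches the whole URL or only searches within it.

```python
import re

# Same pattern as in the Quick Start call above.
link_pattern = r"https://v\.daum\.net/v/\d+"

# Hypothetical links that a listing page might contain.
candidates = [
    "https://v.daum.net/v/20250101123456",  # article-style URL
    "https://news.daum.net/tech",           # the listing page itself
    "https://vXdaum.net/v/123",             # rejected: the dots are escaped
    "https://v.daum.net/v/abc",             # rejected: \d+ requires digits
]

# Assumption: the whole URL must match the pattern.
matched = [url for url in candidates if re.fullmatch(link_pattern, url)]
print(matched)
```

Escaping the dots (`\.`) matters here: an unescaped `.` would match any character, so look-alike hosts would pass the filter.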
🔑 Key Concepts
- Strategies: Switch execution modes effortlessly (file, requests, sqlite).
- SayouPacket: A standardized data container (Success/Fail status, Data, Metadata) ensuring type safety.
- Feedback Loop: Generators can dynamically create new tasks based on Fetcher results (e.g., finding new links, next DB page).
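The packet and feedback-loop ideas above can be sketched without any network access. Everything in this block is a hypothetical stand-in: `Task`, `Packet`, `fetch`, `run`, and the in-memory `SITE` map are not sayou-connector's API, and the real `SayouPacket` fields may differ beyond the `task`, `success`, `data`, and `error` attributes used in the Quick Start. Treating `max_depth` as "number of link hops from the seed" is also an assumption.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Dict, Iterator, List, Optional


# Hypothetical stand-ins for the library's task and packet types.
@dataclass(frozen=True)
class Task:
    uri: str
    depth: int = 0


@dataclass
class Packet:
    task: Task
    success: bool
    data: Any = None
    error: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)


# In-memory "site": each URI maps to the links found on that page.
SITE: Dict[str, List[str]] = {
    "/index": ["/a", "/b", "/missing"],
    "/a": ["/b", "/c"],
    "/b": [],
    "/c": [],
}


def fetch(task: Task) -> Packet:
    """Fetcher role: retrieve one task and wrap the outcome in a packet."""
    links = SITE.get(task.uri)
    if links is None:
        return Packet(task, success=False, error=f"not found: {task.uri}")
    return Packet(task, success=True, data={"links": links})


def run(seed: str, max_depth: int) -> Iterator[Packet]:
    """Generator role: turn each successful result into new tasks."""
    queue = deque([Task(seed)])
    seen = {seed}
    while queue:
        task = queue.popleft()
        packet = fetch(task)
        yield packet
        # Failed fetches and pages at the depth limit produce no new tasks.
        if not packet.success or task.depth >= max_depth:
            continue
        for link in packet.data["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(Task(link, task.depth + 1))


packets = list(run("/index", max_depth=1))
for p in packets:
    print(p.task.uri, "ok" if p.success else p.error)
```

Failures travel through the same stream as successes, so the caller can handle both with a single `if packet.success` check, as in the Quick Start.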
🤝 Contributing
We welcome contributions for new Fetchers (e.g., S3, Kafka) or Generators!
📜 License
Apache 2.0 License © 2025 Sayouzone
File details
Details for the file sayou_connector-0.1.4.tar.gz.
File metadata
- Download URL: sayou_connector-0.1.4.tar.gz
- Upload date:
- Size: 19.7 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | dcd1dbbf58d4f23c7c5f620bd1bff69e2936efe681619624aaa4c784cddc0394 |
| MD5 | ccddf734d82d7aa0fce028f6d363d451 |
| BLAKE2b-256 | c4bbf32a663a6ec4b85902421bb183bb5a0dcd6148e7d2e7cdff211085aa4cdc |
File details
Details for the file sayou_connector-0.1.4-py3-none-any.whl.
File metadata
- Download URL: sayou_connector-0.1.4-py3-none-any.whl
- Upload date:
- Size: 17.8 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/6.1.0 CPython/3.13.7
File hashes
| Algorithm | Hash digest |
|---|---|
| SHA256 | 8bfd29fb21d1311780c8ae0db454364872e49aa76ca59d09142fb2ca6c89feb4 |
| MD5 | f89c6242fad81ea92c6bd232248b6621 |
| BLAKE2b-256 | ea640de7ac8956e823be5bf3166f4cb466a0002b08ec46451b54027b21515ffc |