Wrapper components for the Sayou Data Fabric

Sayou Refinery (sayou_refinery)

A pluggable framework for refining raw Data Atoms into a coherent Knowledge Graph (KG) for advanced LLM applications.


💡 Why Sayou Refinery?

sayou_refinery addresses the problem of organizing messy, disconnected data into a structured KG. The KG acts as a "map" for RAG pipelines, so LLMs can retrieve accurate, context-aware data, which helps reduce hallucinations and cost.

  • Pluggable Architecture: Bring your own data store (Neo4j, JSON) or refinement logic.
  • Ontology-Driven: Ensures all data conforms to your central schema.
  • Focused Responsibility: Does one job well: Refine & Link. No connectors, no embedding logic.

🚀 Quick Start (v0.0.1)

1. Installation

pip install sayou-refinery

2. Usage (Example)

sayou_refinery is a library that you import into your own project. The full example code is in examples/subway_refinery/run.py.

# your_project/run.py
from sayou.refinery.pipeline import Pipeline
from sayou.refinery.schema.manager import OntologyManager
from sayou.refinery.schema.validator import SchemaValidator
from sayou.refinery.graph.builder import KnowledgeGraphBuilder
from sayou.refinery.linker.default_linker import DefaultLinker
from sayou.refinery.store.json_store import JsonStore

# 1. Import your custom domain logic
from your_project.my_refiner import MyDomainRefiner

# 2. Prepare components (Explicit Injection)
schema_manager = OntologyManager()
validator = SchemaValidator()
refiner = MyDomainRefiner() # Your logic
builder = KnowledgeGraphBuilder()
linker = DefaultLinker()
store = JsonStore()

# 3. Create and configure the pipeline
pipeline = Pipeline(
    schema_manager=schema_manager,
    validator=validator,
    refiner=refiner,
    builder=builder,
    linker=linker,
    store=store
)

pipeline.initialize(
    ontology_path="path/to/your_schema.json",
    filepath="output/my_kg.json" # Config for JsonStore
)

# 4. Load your data atoms
my_atoms = [...] # Load your DataAtom objects

# 5. Run
pipeline.run(my_atoms)
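
The quick start imports MyDomainRefiner from your own project without showing one. The following is a minimal sketch of what such a class might look like, based on the subway-averaging idea mentioned under Core Concepts. The real BaseRefiner interface and DataAtom structure are not documented here, so everything in it is an assumption: atoms are modeled as plain dicts, the `refine` method name is hypothetical, and the station names and counts are made-up sample data. It runs on its own and is not guaranteed to plug into the pipeline as written.

```python
from collections import defaultdict
from statistics import mean


class MyDomainRefiner:
    """Illustrative refiner that averages passenger counts per station.

    Hypothetical stand-in: it does not subclass the library's BaseRefiner,
    and `refine` is an assumed method name, not the documented API.
    """

    def refine(self, atoms):
        # Group raw passenger counts by station name.
        grouped = defaultdict(list)
        for atom in atoms:
            grouped[atom["station"]].append(atom["passengers"])

        # Emit one refined record per station, sorted for stable output.
        return [
            {"station": station, "avg_passengers": mean(counts)}
            for station, counts in sorted(grouped.items())
        ]


# Made-up sample atoms (plain dicts standing in for DataAtom objects).
sample_atoms = [
    {"station": "Gangnam", "passengers": 120},
    {"station": "Gangnam", "passengers": 80},
    {"station": "Seoul", "passengers": 50},
]

refined = MyDomainRefiner().refine(sample_atoms)
print(refined)
```

To adapt this to the library, check the BaseRefiner class in the package source for the actual method names and the expected atom type.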

🏗️ Core Concepts

  • Data Atom: The standard input unit. (Schema and structure explanation to be added.)

  • Refiner (BaseRefiner): Cleans, aggregates, or transforms atoms. (e.g., averaging subway data)

  • Linker (BaseLinker): Establishes relationships between nodes.

  • Store (BaseStore): The output driver (JSON, Neo4j, etc.).
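
The linker and store roles above can be illustrated with plain Python stand-ins. This is only a sketch under stated assumptions: SameLineLinker and TinyJsonStore are hypothetical classes, their `link` and `save` methods are invented for illustration, and the node fields (`id`, `line`) are made up. None of this is the library's BaseLinker or BaseStore API.

```python
import json
import os
import tempfile
from itertools import combinations


class SameLineLinker:
    """Hypothetical linker: connects nodes that share the same 'line' value."""

    def link(self, nodes):
        edges = []
        # Compare every unordered pair of nodes exactly once.
        for a, b in combinations(nodes, 2):
            if a["line"] == b["line"]:
                edges.append(
                    {"source": a["id"], "target": b["id"], "type": "SAME_LINE"}
                )
        return edges


class TinyJsonStore:
    """Hypothetical store: writes nodes and edges to a single JSON file."""

    def __init__(self, filepath):
        self.filepath = filepath

    def save(self, nodes, edges):
        with open(self.filepath, "w", encoding="utf-8") as f:
            json.dump({"nodes": nodes, "edges": edges}, f, indent=2)


# Made-up nodes standing in for refined output.
nodes = [
    {"id": "s1", "line": "2"},
    {"id": "s2", "line": "2"},
    {"id": "s3", "line": "4"},
]

edges = SameLineLinker().link(nodes)

# Write to a temporary directory so the sketch leaves no files in your project.
out_path = os.path.join(tempfile.mkdtemp(), "my_kg.json")
TinyJsonStore(out_path).save(nodes, edges)

# Read the file back to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    saved = json.load(f)
print(saved["edges"])
```

Keeping linking and storage in separate objects mirrors the pluggable design described above: swapping the JSON file for another backend would only require a different store object with the same method.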

🤝 Contributing

We welcome contributions! Please read our CONTRIBUTING.md (to be added) file for details on how to submit pull requests.

📜 License

This project is licensed under the MIT License.
