Graph-like planning → context fetching → synthesis agent (library-style).

fetchgraph

Universal, library-style agent that plans what to fetch, fetches context from pluggable providers, and synthesizes an output.

Pipeline: PLAN → FETCH → (ASSESS/REFETCH)* → SYNTH → VERIFY → (REFINE)* → SAVE
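
To make the control flow of the pipeline concrete, here is a minimal sketch in plain Python. It is not fetchgraph's implementation: `run_pipeline` and all of its parameters are hypothetical names chosen for illustration, and the refetch rule (retry providers that returned nothing) is an assumption.

```python
from typing import Callable, Dict, List


def run_pipeline(
    feature_name: str,
    plan: Callable[[str], List[str]],
    providers: Dict[str, Callable[[str], str]],
    synth: Callable[[str, Dict[str, str]], str],
    verify: Callable[[str], List[str]],
    refine: Callable[[str, List[str]], str],
    save: Callable[[str, str], None],
    max_refetch: int = 1,
    max_refine: int = 2,
) -> str:
    # PLAN: decide which providers to call.
    wanted = [name for name in plan(feature_name) if name in providers]

    # FETCH: call each planned provider once.
    context = {name: providers[name](feature_name) for name in wanted}

    # ASSESS/REFETCH: retry providers that returned nothing (assumed rule).
    for _ in range(max_refetch):
        missing = [name for name in wanted if not context.get(name)]
        if not missing:
            break
        for name in missing:
            context[name] = providers[name](feature_name)

    # SYNTH: produce a first draft from the fetched context.
    output = synth(feature_name, context)

    # VERIFY/REFINE: loop until no errors remain or the budget runs out.
    for _ in range(max_refine):
        errors = verify(output)
        if not errors:
            break
        output = refine(output, errors)

    # SAVE: persist the final output.
    save(feature_name, output)
    return output


saved: Dict[str, str] = {}
result = run_pipeline(
    "FeatureX",
    plan=lambda name: ["spec"],
    providers={"spec": lambda name: f"Spec for {name}"},
    synth=lambda name, ctx: f"summary: {ctx['spec']}",
    verify=lambda out: [] if out.startswith("result:") else ["must start with 'result:'"],
    refine=lambda out, errs: "result: " + out,
    save=saved.__setitem__,
)
print(result)  # result: summary: Spec for FeatureX
```

In this toy run the first draft fails verification once, is refined, and then passes before being saved.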

Why fetchgraph?

fetchgraph is a library-style LLM agent orchestrator.
You bring:

your LLM (OpenAI, local, whatever),
your data providers (DBs, APIs, files),

and fetchgraph handles:

planning what context to fetch,
calling providers with JSON selectors,
packing context into the prompt,
verifying / refining the result.

Features

JSON-only selectors with JSON Schema hints for planners
Pluggable context providers (APIs, relational sources, etc.)
Relational providers with semantic clauses
CSV semantic backend (TF-IDF) for pandas providers
pgvector / LangChain vector store integration
Library-style API: no framework lock-in

Install

pip install fetchgraph

Quick Start

Selectors are JSON-only

Providers receive a selectors argument that must be JSON-serializable. The shared alias SelectorsDict (see fetchgraph/json_types.py) represents Dict[str, JSONValue] and is used across protocols and models. The planner/LLM produces this structure, so do not place runtime-only Python objects (e.g. connections, DataFrames) into selectors; pass such hints through **kwargs instead. Providers can publish the expected shape via ProviderInfo.selectors_schema (a JSON Schema) and optional examples containing stringified JSON payloads.
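
One way to catch runtime-only objects before they reach a provider is a round trip through the standard `json` module. The helper below is a hypothetical sketch, not part of fetchgraph; it only illustrates what "JSON-serializable" means in practice.

```python
import json
from typing import Any


def is_json_serializable(selectors: Any) -> bool:
    """Return True if `selectors` can be encoded as strict JSON."""
    try:
        # allow_nan=False also rejects NaN/Infinity, which are not valid JSON.
        json.dumps(selectors, allow_nan=False)
    except (TypeError, ValueError):
        return False
    return True


good = {"op": "query", "sql": "select 1", "limit": 10}
bad = {"op": "query", "connection": object()}  # runtime-only object

print(is_json_serializable(good))  # True
print(is_json_serializable(bad))   # False
```

Objects that fail this check belong in `**kwargs`, not in selectors.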

Relational providers require selectors to include a string field "op" that chooses the operation (e.g., "schema", "semantic_only", "query"). The complete set of supported shapes is described by the schema returned from RelationalDataProvider.describe().

from fetchgraph import (
  BaseGraphAgent, ContextPacker, BaselineSpec, ContextFetchSpec,
  TaskProfile, RawLLMOutput
)
from fetchgraph.core import make_llm_plan_generic, make_llm_synth_generic

# Define providers (implement the ContextProvider protocol)
class SpecProvider:
    name = "spec"

    def fetch(self, feature_name, selectors=None, **kw):
        # Return the raw context object for this feature.
        return {"content": f"Spec for {feature_name}"}

    def serialize(self, obj):
        # Turn the fetched object into prompt-ready text.
        return obj.get("content", "") if isinstance(obj, dict) else str(obj)

# Stub LLM: returns a canned plan or a canned synthesis depending on the caller.
def dummy_llm(prompt: str, sender: str) -> str:
    if sender == "generic_plan":
        return '{"required_context":["spec"],"context_plan":[{"provider":"spec","mode":"full"}]}'
    if sender == "generic_synth":
        return "result: ok"
    return ""

profile = TaskProfile(
  task_name="Demo",
  goal="Produce YAML doc from spec",
  output_format="YAML: result: <...>"
)

class OkVerifier:
    """Verifier that accepts every output (returns no errors)."""
    name = "ok"

    def check(self, out):
        return []

agent = BaseGraphAgent(
  llm_plan=make_llm_plan_generic(dummy_llm, profile, {"spec": SpecProvider()}),
  llm_synth=make_llm_synth_generic(dummy_llm, profile),
  domain_parser=lambda raw: raw.text,  # RawLLMOutput -> Any
  saver=lambda feature_name, parsed: None,  # save side-effect
  providers={"spec": SpecProvider()},
  verifiers=[OkVerifier()],
  packer=ContextPacker(max_tokens=2000, summarizer_llm=lambda t: t[:200]),
  baseline=[BaselineSpec(ContextFetchSpec(provider="spec"))],
)

print(agent.run("FeatureX"))
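
The quick start uses a verifier that always passes. A slightly more useful one might look like the sketch below. The shape (a `name` attribute plus a `check(out)` method returning a list of error strings, with an empty list meaning success) is inferred from the quick start rather than taken from the fetchgraph API reference, and `StartsWithResultVerifier` is a hypothetical name.

```python
from typing import Any, List


class StartsWithResultVerifier:
    """Checks that the synthesized output begins with a `result:` key."""

    name = "starts_with_result"

    def check(self, out: Any) -> List[str]:
        text = out if isinstance(out, str) else str(out)
        if text.lstrip().startswith("result:"):
            return []  # assumed convention: no errors means the output passes
        return ["output must start with 'result:'"]


verifier = StartsWithResultVerifier()
print(verifier.check("result: ok"))   # []
print(verifier.check("status: ok"))   # ["output must start with 'result:'"]
```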

Working with selectors

Plan-time inputs: The planner/LLM crafts selectors (a SelectorsDict) for each ContextFetchSpec. These inputs must be JSON-serializable and should be validated by providers using their published JSON Schema.
Provider contract: Implementations of ContextProvider.fetch should accept selectors: Optional[SelectorsDict] = None and treat **kwargs as optional runtime hints that may be non-serializable.
Schema + examples: Providers can guide planners by returning ProviderInfo(selectors_schema=..., examples=[...]) from describe().
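
As an illustration of validating selectors against a published schema, the sketch below hand-rolls a tiny matcher that understands only `oneOf`, `type`, `const`, `required`, and `properties`. It is an assumption-laden toy, not fetchgraph code; in practice a full validator such as the third-party `jsonschema` package would be the more likely choice.

```python
from typing import Any, Dict

# Only the two JSON Schema types used in this example are mapped.
_TYPES = {"string": str, "object": dict}


def matches(schema: Dict[str, Any], value: Any) -> bool:
    """Check `value` against a very small subset of JSON Schema."""
    if "oneOf" in schema:
        # oneOf: exactly one alternative must match.
        return sum(matches(sub, value) for sub in schema["oneOf"]) == 1
    if "const" in schema and value != schema["const"]:
        return False
    expected = schema.get("type")
    if expected is not None and not isinstance(value, _TYPES[expected]):
        return False
    if isinstance(value, dict):
        if any(key not in value for key in schema.get("required", [])):
            return False
        for key, sub in schema.get("properties", {}).items():
            if key in value and not matches(sub, value[key]):
                return False
    return True


schema = {
    "oneOf": [
        {"type": "object", "required": ["op"], "properties": {"op": {"const": "schema"}}},
        {
            "type": "object",
            "required": ["op", "sql"],
            "properties": {"op": {"const": "query"}, "sql": {"type": "string"}},
        },
    ]
}

print(matches(schema, {"op": "schema"}))                    # True
print(matches(schema, {"op": "query", "sql": "select 1"}))  # True
print(matches(schema, {"op": "query"}))                     # False
```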

Example for a relational provider that requires an "op" selector:

from typing import Optional

from fetchgraph.json_types import SelectorsDict
from fetchgraph.models import ProviderInfo

class RelationalDataProvider:
    name = "relational"

    def fetch(self, feature_name: str, selectors: Optional[SelectorsDict] = None, **kwargs):
        # selectors may be None under the provider contract, so guard before calling .get()
        op = (selectors or {}).get("op")
        if not op:
            raise ValueError("selectors.op is required")
        ...  # existing logic for schema/semantic_only/query

    def describe(self) -> ProviderInfo:
        schema = {
            "oneOf": [
                {"type": "object", "required": ["op"], "properties": {"op": {"const": "schema"}}},
                {"type": "object", "required": ["op", "sql"], "properties": {"op": {"const": "query"}, "sql": {"type": "string"}}},
            ]
        }
        return ProviderInfo(
            name=self.name,
            selectors_schema=schema,
            examples=['{"op":"schema"}', '{"op":"query","sql":"select 1"}'],
        )

During planning, you can pass selectors to ContextFetchSpec to pin the operation:

fetch_spec = ContextFetchSpec(provider="relational", selectors={"op": "schema"})

CSV semantic backend for Pandas providers

fetchgraph.semantic_backend ships a lightweight TF-IDF backend that turns a CSV file into semantic embeddings and reuses them across runs. The flow is:

Build embeddings from a CSV once using CsvEmbeddingBuilder and persist them alongside the CSV.
Configure a CsvSemanticBackend with one or more CsvSemanticSource entries (one per entity) pointing at the CSV and saved embeddings.
Pass that backend into PandasRelationalDataProvider so semantic clauses can delegate matching to the precomputed vectors.

Example setup:

from pathlib import Path
from fetchgraph.semantic_backend import (
    EmbeddingModel,
    CsvEmbeddingBuilder,
    CsvSemanticBackend,
    CsvSemanticSource,
)
from fetchgraph.relational_models import EntityDescriptor, ColumnDescriptor
from fetchgraph.relational_pandas import PandasRelationalDataProvider

csv_path = Path("products.csv")
embedding_path = Path("products_embeddings.json")

# Build once (e.g., during deployment) to avoid recomputing embeddings at runtime.
CsvEmbeddingBuilder(
    csv_path=csv_path,
    entity="product",
    id_column="id",
    text_fields=["name", "description"],
    output_path=embedding_path,
).build()

semantic_backend = CsvSemanticBackend(
    {"product": CsvSemanticSource("product", csv_path, embedding_path)}
)

entities = [
    EntityDescriptor(
        name="product",
        columns=[
            ColumnDescriptor(name="id", role="primary_key"),
            ColumnDescriptor(name="name"),
            ColumnDescriptor(name="description"),
        ],
    )
]

provider = PandasRelationalDataProvider(
    name="products", entities=entities, relations=[], frames={"product": ...}, semantic_backend=semantic_backend
)
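
For intuition about what a TF-IDF backend computes, here is a self-contained sketch using only the standard library. It is not fetchgraph's implementation: the tokenizer, the smoothed IDF formula, and every function name here are assumptions made for illustration, and the library's weighting and storage format may differ.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    return text.lower().split()


def build_idf(docs: List[List[str]]) -> Dict[str, float]:
    """Smoothed inverse document frequency for every term in the corpus."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf(tokens: List[str], idf: Dict[str, float]) -> Dict[str, float]:
    """Sparse TF-IDF vector; terms unseen at build time are ignored."""
    counts = Counter(tokens)
    return {term: count * idf[term] for term, count in counts.items() if term in idf}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy "CSV" rows keyed by id.
rows = {1: "red running shoes", 2: "blue rain jacket", 3: "trail running jacket"}

# Build once: tokenize rows, compute IDF, and store one vector per row.
docs = {row_id: tokenize(text) for row_id, text in rows.items()}
idf = build_idf(list(docs.values()))
vectors = {row_id: tfidf(tokens, idf) for row_id, tokens in docs.items()}

# Query time: embed the query with the same IDF table and rank rows.
query = tfidf(tokenize("running shoes"), idf)
ranked = sorted(vectors, key=lambda row_id: cosine(query, vectors[row_id]), reverse=True)
print(ranked)  # [1, 3, 2]
```

Persisting `idf` and `vectors` after the build step is what lets the query step skip recomputation on later runs.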

You can plug in an embedding model (for example, an OpenAI client) to build and query dense embeddings instead of the default TF-IDF vectors:

from pathlib import Path

from fetchgraph.semantic_backend import (
    EmbeddingModel,
    CsvSemanticSource,
    CsvEmbeddingBuilder,
    CsvSemanticBackend,
)


class OpenAIEmbeddingModel:
    def __init__(self, client):
        self.client = client

    def embed_documents(self, texts):
        # Placeholder vectors; replace with a real call such as client.embeddings.create(...)
        return [[1.0, 0.0] for _ in texts]

    def embed_query(self, text):
        return self.embed_documents([text])[0]


# `client` is your own, already-configured API client (for example, an OpenAI client);
# it is not defined in this snippet.
embedding = OpenAIEmbeddingModel(client)

CsvEmbeddingBuilder(
    csv_path="fbs.csv",
    entity="fbs",
    id_column="id",
    text_fields=["name", "description"],
    output_path="fbs_embeddings.json",
    embedding_model=embedding,
).build()

csv_backend = CsvSemanticBackend(
    {
        "fbs": CsvSemanticSource(
            entity="fbs",
            csv_path=Path("fbs.csv"),
            embedding_path=Path("fbs_embeddings.json"),
        )
    },
    embedding_model=embedding,
)

At query time, SemanticClause filters sent to the relational provider will call semantic_backend.search(...) with the requested entity, fields, and query text. Fields must be a subset of the indexed CSV columns (not including the reserved __all__ combined projection). By default, field similarities are summed; adjust the backend if you need a different aggregation strategy.
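
The summing behaviour, and what swapping in a different aggregation could look like, can be sketched as follows. The function and its `reducer` parameter are hypothetical and only model the idea; they are not the backend's actual extension point.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple


def aggregate(
    per_field_scores: Dict[str, Dict[str, float]],
    reducer: Callable[[Iterable[float]], float] = sum,
) -> List[Tuple[str, float]]:
    """Combine per-field similarity scores into one score per row id."""
    collected: Dict[str, List[float]] = defaultdict(list)
    for scores in per_field_scores.values():
        for row_id, score in scores.items():
            collected[row_id].append(score)
    combined = {row_id: reducer(values) for row_id, values in collected.items()}
    # Highest score first; ties broken by row id for a stable order.
    return sorted(combined.items(), key=lambda item: (-item[1], item[0]))


scores = {
    "name": {"p1": 0.75, "p2": 0.25},
    "description": {"p1": 0.25, "p2": 0.5, "p3": 0.5},
}

print(aggregate(scores))               # [('p1', 1.0), ('p2', 0.75), ('p3', 0.5)]
print(aggregate(scores, reducer=max))  # [('p1', 0.75), ('p2', 0.5), ('p3', 0.5)]
```

Summing rewards rows that match in several fields, while `max` ranks a row by its single best field.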

pgvector / LangChain vector stores

If you already manage embeddings in PostgreSQL with pgvector via LangChain, you can supply your existing vector stores directly:

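from langchain_community.vectorstores.pgvector import PGVector
from fetchgraph.semantic_backend import PgVectorSemanticBackend, PgVectorSemanticSource

# `embeddings` is your LangChain Embeddings instance (the same model used to build the index);
# it is not defined in this snippet.
vector_store = PGVector.from_existing_index(
    embedding=embeddings,
    collection_name="product_vectors",
    connection_string="postgresql+psycopg://...",
)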

semantic_backend = PgVectorSemanticBackend(
    {
        "product": PgVectorSemanticSource(
            entity="product",
            vector_store=vector_store,
            metadata_entity_key="entity",  # optional, defaults to "entity"
            metadata_field_key="field",    # optional, defaults to "field"
            id_metadata_keys=("id",),       # optional metadata key(s) to read the row identifier
            score_kind="distance",          # convert pgvector distances into similarity scores
        )
    }
)

The backend will filter returned documents by entity and requested fields using Document metadata before converting scores into SemanticMatch entries.
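
The filter-then-convert step can be pictured with the sketch below. Everything in it is an assumption made for illustration: `Doc` is a stand-in for a LangChain Document, the `1 / (1 + distance)` conversion is one common choice rather than the formula fetchgraph necessarily uses, and the output is a plain tuple instead of a SemanticMatch.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Doc:
    """Stand-in for a vector-store document with metadata."""
    page_content: str
    metadata: Dict[str, str]


def distance_to_similarity(distance: float) -> float:
    # Assumed conversion: smaller distance maps to a higher score in (0, 1].
    return 1.0 / (1.0 + distance)


def filter_and_score(
    results: Sequence[Tuple[Doc, float]],
    entity: str,
    fields: Sequence[str],
    entity_key: str = "entity",
    field_key: str = "field",
    id_key: str = "id",
) -> List[Tuple[str, float]]:
    """Keep hits for the requested entity and fields, then score them."""
    allowed = set(fields)
    matches = []
    for doc, distance in results:
        meta = doc.metadata
        if meta.get(entity_key) != entity or meta.get(field_key) not in allowed:
            continue
        matches.append((meta.get(id_key, ""), distance_to_similarity(distance)))
    return sorted(matches, key=lambda match: match[1], reverse=True)


hits = [
    (Doc("Trail shoe", {"entity": "product", "field": "name", "id": "1"}), 0.0),
    (Doc("Order 17", {"entity": "order", "field": "name", "id": "17"}), 0.1),
    (Doc("Light jacket", {"entity": "product", "field": "description", "id": "2"}), 1.0),
    (Doc("SKU-9", {"entity": "product", "field": "sku", "id": "3"}), 0.2),
]

print(filter_and_score(hits, "product", ["name", "description"]))
# [('1', 1.0), ('2', 0.5)]
```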

LICENSE

MIT License

Copyright (c) 2025 ...

Permission is hereby granted, free of charge, to any person obtaining a copy
...

Release history

0.2.0 (Jan 4, 2026)
0.1.2 (Jan 4, 2026)
0.1.1 (Dec 18, 2025), this version
0.0.3 (Nov 28, 2025)
0.0.2 (Sep 27, 2025)
0.0.1 (Sep 26, 2025)
